Differently expressed long noncoding RNAs and mRNAs in TK6 cells exposed to low dose hydroquinone

Previous studies have shown that long noncoding RNAs (lncRNAs) were related to human carcinogenesis and might be designated as diagnosis and prognosis biomarkers. Hydroquinone (HQ), as one of the metabolites of benzene, was closely relevant to occupational benzene poisoning and occupational leukemia. Using high-throughput sequencing technology, we investigated differences in lncRNA and mRNA expression profiles between experimental group (HQ 20 μmol/L) and control group (PBS). Compared to control group, a total of 65 lncRNAs and 186 mRNAs were previously identified to be aberrantly expressed more than two fold change in experimental group. To validate the sequencing results, we selected 10 lncRNAs and 10 mRNAs for quantitative real-time PCR (qRT-PCR). Through GO annotation and KEGG pathway analysis, we obtained 3 mainly signaling pathways, including P53 signaling pathway, which plays an important role in tumorigenesis and progression. After that, 25 lncRNAs and 32 mRNAs formed the lncRNA-mRNA co-expression network were implemented to play biological functions of the dysregulated lncRNAs transcripts by regulating gene expression. The lncRNAs target genes prediction provided a new idea for the study of lncRNAs. Finally, we have another important discovery, which is screened out 11 new lncRNAs without annotated. All these results uncovered that lncRNA and mRNA expression profiles in TK6 cells exposed to low dose HQ were different from control group, helping to further study the toxicity mechanisms of HQ and providing a new direction for the therapy of leukemia.


INTRODUCTION
Benzene is recognized as a human carcinogen, which is a widely used occupational harmful factor in industry [1]. The international agency for research on cancer confirms that benzene is a human cause of leukemia, and hematopoietic system is the target organ of benzene toxicity [2]. Hydroquinone (HQ) is an alternative to benzene in vitro. Moreover, a case of acute myeloid leukemia (AML) caused by exposure to HQ for 16 years has been reported recently [3]. Several studies have reported that short-term exposure to HQ induced oxidative DNA damage and apoptosis [4]. Our previous research found that long-term exposure to HQ could result in transforming capability in vitro and the tumorigenesis capability in vivo [5]. Furthermore, our data have shown

Research Paper
Oncotarget 95555 www.impactjournals.com/oncotarget that HQ-induced activation of myeloproliferative leukemia virus oncogene was connected with DNA hypomethylation [6]. Thus it can be seen that HQ poses a great threat to human health in the case of unavoidable contact. Making an intensive study on the mechanisms of HQ toxicity is helpful for the early diagnosis and therapy of leukemia. There is an urgent need to improve early detection and identify new targets for therapy.
Long noncoding RNAs (lncRNAs) are defined as newly-identified class of noncoding RNAs (ncRNAs) longer than 200 nucleotides, which were regarded as transcription noise for lacking of open reading frame (ORF) and non-protein coding function [7][8][9]. Accumulating evidences have indicated that an increased abundance of lncRNAs participating in a variety of biological processes, such as transcription, splicing, translation, chromatin modification, cell growth, cell cycle regulation, apoptosis [10,11,8]. Meanwhile, several authors have proposed that lncRNAs can be used as competing endogenous RNAs (ceRNAs) to competitively sponge miRNAs to regulate target genes [12,13]. With the maturity of the sequencing platform, more and more unknown lncRNAs have entered the academic field of view and attracted the interest of researchers. According to LNCipedia 4.1, the latest version of this long non-coding RNA database contains 146,742 human annotated lncRNAs. Several researchers have revealed that lncRNAs were dysregulated in blood samples of leukemia patients (in vivo) and leukemia cells (in vitro), which involved in hematopoietic system toxicity [14,15]. To some extent, lncRNAs as the biomarkers of disease diagnosis and prognosis have become an integral part of the study. Scholars will uncover the mystery of lncRNAs and tap its little-known functions gradually.
With the depth and development of researches, the study of lncRNAs has gradually shifted from human diseases to environmental exogenous chemicals. In this study, we performed profiles of differentially expressed lncRNAs and mRNAs in 2 pairs of low dose experimental group (HQ 20 μmol/L) and control group (PBS) (repeat 3 times). We also established co-expression networks for differentially expressed lncRNAs and mRNAs, with a view to comprehensively investigating the biological functions of lncRNAs. In addition, we have found several new lncRNAs that have not yet been annotated in major databases, suggesting that there is a possibility of a major breakthrough.

RESULTS
Total RNA purity, concentration and quality identification of samples PBS and HQ (20 μmol/L) treated TK6 cells in a total of two sets (3 replicates per group) in our study. Quality inspection results have shown that the total RNA purity (A260/A280) of all samples was about 2.0 ( Table 1), indicating that there was no protein residue. After the denatured RNA electrophoresis of the total RNA, the bands were clearly observed on the gel imaging system, indicating that the quality of the total RNA was good, and there was no obvious degradation ( Figure 1). To sum up, all total RNA were suitable for further analysis.

Sequencing results validation
On the one hand, in order to validate the sequencing results, on the other hand, for the further in-depth research, we selected 10 differentially expressed lncRNAs and mRNAs of interest for qRT-PCR, respectively. Our results have shown that the changes were statistically different for only 7 of the 10 lncRNAs transcripts and 6 of the 10 mRNAs. Among 7 lncRNAs transcripts (c11orf92, ENSG00000272259.1, LILRP2, LOC344887, LGALS9, HBBP1 and ENSG00000270164.1), 5 lncRNAs transcripts (LILRP2, LOC344887, LGALS9, HBBP1 and ENSG00000270164.1) were same trend as the sequencing results. The sequencing data showed that the expression of C11orf92 and ENSG00000272259.1 were upregulated, while qRT-PCR was verified to be downregulated. LILRP2, LOC344887, LGALS9, HBBP1, ENSG00000270164.1 (LINC01480) were upregulated as shown in the Figure 3A showed that the expression levels of LOC344887 and ENSG00000270164.1 (LINC01480) increased with the increase of HQ concentration after 24h or 48h in TK6 cells exposed to different concentrations HQ ( Figure 3C, 3D).

GO and KEGG pathway analysis
In general, lncRNAs played biological functions by co-expressed mRNAs because they have no ability to encode proteins. Thence, we used two enrichment analyses, GO and KEGG pathway analysis to predict the potential roles of dysregulated mRNAs. GO annotation results revealed 3 structured networks: biological processes (1065 GO Terms), cellular components (115 GO Terms) and molecular function (155 GO Terms). In 186 differentially expressed mRNAs, we selected 32 mRNAs for GO annotation and KEGG pathway analysis. Among 32 differentially expressed mRNAs,  T-20-6.

LncRNA-mRNA co-expression network analysis and lncRNAs target genes prediction
We screened 25 lncRNAs and 32 mRNAs to construct the lncRNA-mRNA co-expression network. We could see that a single lncRNA could be co-expressed with tens of the differentially expressed mRNA, we set a strict standard to build the most reliable lncRNA-mRNA co-expression pairs through the Pearson correlation value and p value ( Figure 5A). In order to gain a better understanding of the biological functions of lncRNAs, we predicted the target genes of the dysregulated lncRNAs by target prediction softs. We selected 27 lncRNAs for target genes prediction, and the results showed that 18 lncRNAs had target genes and 9 had no target genes. Among 18 lncRNAs, 13 lncRNAs had cis target genes, and 5 lncRNAs had trans target genes ( Figure 5B, Table 4).

New lncRNAs prediction
Besides the known lncRNAs, our sequencing results also included 11 new lncRNAs as shown in the following Figure 2: Gene expression profile differences between HQ group compared to PBS group. Volcano plots of the differentially expressed lncRNAs (A) and mRNAs (B). Dark dots represent lncRNAs/mRNAs not significantly differentially expressed (fold change < 2, P > 0.05) and red dots represent lncRNAs/mRNAs significantly differentially expressed (fold change ≥ 2, P < 0.05). Left and right indicate low and high expression, respectively. Hierarchical clustering indicates lncRNAs (C) and mRNAs (D) profiles. Red column indicates up regulation and blue column indicates down regulation, red through blue color indicates high to low expression level. www.impactjournals.com/oncotarget table (Table 5). Same as the known lncRNAs mentioned above, we also constructed the volcanic plot ( Figure  6A) and the heat map ( Figure 6B) of the differentially expressed new lncRNAs. From the table and the figures, we could see that 7 and 4 new lncRNAs are upregulated and downregulated, respectively.

DISCUSSION
Up to this date, many studies have shown that lncRNAs played crucial biological roles in human diseases, such as cancer [16,17], cardiovascular diseases [18] and neuropathic pain [19]. However, there are few studies about lncRNAs involving in exogenous compounds in the environment. Therefore, the study on functions and mechanisms of lncRNAs in this area is limited. In contrast to microarray, the next-generation sequencing techniques were allowed for faster and accurate to gain the differentially expressed profiles of genes. Because it doesn't need to synthesis probes, which sequenced directly [20]. In addition, to our knowledge, no obvious research has focused on the RNA-Seq expression profile of lncRNAs in HQ treatment. In view of the above considerations, we used RNA-Seq techniques to profile lncRNAs in experimental group (HQ 20 μmol/L) and control group (PBS).
Our previous studies about HQ toxicity mechanisms results mainly involved in autophagy [21], cell cycle and apoptosis [22] and malignant transformation [23]. For a more comprehensive understand the toxicity mechanisms of HQ, we treated TK6 cells from the spleen of hereditary red blood cells anemia patients with exogenous compound (20 μmol/L HQ) as the experimental group in this study. TK6 lymphoblastoid cells line express wild type p53 and being a commonly used model for HQ and/ or leukemia researches. Based on the actual level of human contact with HQ and preliminary studies, TK6 cells were treated with HQ 20μmol/L. Our results are the first RNA-Seq analysis indicating 65 lncRNAs and 186 mRNAs dysregulated in HQ treatment with fold changes of two or more. Compared with the control group (PBS), LILRP2, LOC344887, LGALS9, HBBP1 and ENSG00000270164.1 (LINC01480) are upregulated and C11orf92, LAMB2P1, LOC63930, SLC7A11-AS1, ENSG00000272259.1 are low expression in RNA-Seq data. We verified that the up-regulation of lncRNAs trends was consistent with sequencing results, while the down-regulation of lncRNAs trends was different from sequencing results. The expression trends of C11orf92 and ENSG00000272259.1 were contrary to sequencing results and the changes were statistically significant. Furthermore,   Figure 3C, 3D, the results showed that the expressions of LOC344887 and ENSG00000270164.1 (LINC01480) increased with the increase of HQ concentration after 24h or 48h in HQ treated TK6 cells at different concentrations. Wu et al. [24] has demonstrated that LOC344887 was amplified in gallbladder cancer and promoted epithelial mesenchymal transition (EMT) . Silencing of LOC344887 via CRISPR/Cas9 genome-editing protected against sulforaphane (SFN) -mediated inhibition of cancer cell growth, colony formation, and migration [25]. There is no literature reported ENSG00000270164.1 (LINC01480) function. Therefore, we can make a hypothesis that LOC344887 and ENSG00000270164.1 (LINC01480) may be involved in HQ induced TK6 cytotoxicity mechanisms. The specific mechanisms or signaling pathways need further exploration, and are also the focus of our next research. With the advantages of high signal-to-noise ratio, high resolution and wide application range, the new generation sequencing technology also has some errors from sample contamination and sequencing errors [26]. It is worth mentioning that lncRNAs can serve as signal, decoy, guide or scaffold molecules to participate in pre-transcriptional regulation, transcriptional regulation and post-translational regulation [27,26,11]. LncRNAs usually regulate target genes to perform their biological function because of lacking of proteins encode ability. Thus, a comprehensive understanding of the potential function for differentially expressed mRNAs is essential. Through GO annotation and KEGG pathway analysis, we could identify potential functions of the dysregulated mRNAs. GO annotation [28] revealed that the vast majority of differentially expressed mRNAs involve in multiple biological processes. IGJ and AMBP involved in cellular components. Simultaneously, partial mRNAs were involved in molecular functions, such as p53 binding, monosaccharide binding and fucose binding. KEGG database [29] revealed that the dysregulated mRNAs mainly involve in HQ toxicity mechanism through p53 signaling pathway and Glycosphingolipid biosynthesis Oncotarget 95560 www.impactjournals.com/oncotarget -globo series pathway. Among them, chronic myeloid leukemia may be associated with MDM2 abnormalities. As the target gene of p53, MDM2 played an important role in the ubiquitin-proteasome pathway [30,31]. Surprisingly, the combination of lincRNa-p21 and MDM2 promoted p53 ubiquitination, that is to say lincRNa-p21 positively regulated MDM-p53 interaction [32]. Studying the potential function of the differentially expressed mRNAs, can help to gain insight into their involvement in signaling pathways and provide convenience for researchers in the study of lncRNAs, simultaneously.
With the maturity of sequencing technology, more and more lncRNAs have been discovered and annotated, but their functional researches are still limited. Due to considerable difficulties in making a thorough inquiry on the lncRNAs functions directly, it is particularly important to predict lncRNAs target genes and construct lncRNA-mRNA co-expression network according to a coexpression-based method [33,18]. LncRNAs regulated the expression of co-expressed mRNAs mainly through cis and trans. For instance, lncRNA-HBBP1 (NR_001589) regulated the expression of HBG1, HBG2, HBD by cis regulation and lncRNA-PHLDA3 (NR_073080) trans regulated the expression of CYP20A1 and ZNF470. The HBG1 and HBG2 are normally expressed in the fetal liver, spleen and bone marrow. Compared with wild type promoter, the activity of HBG1 gene promoter variant decreased significantly in K562 cells under conditions of erythropoietic stress [34]. This finding is helpful to analyze the pathogenesis of β-Thalassemia and provide a   reference for its treatment. Because HBG1 and HBG2 are involved in the pathogenesis of hematologic diseases, it is not difficult to imagine that lncRNA-HBBP1 may also be involved in hematologic diseases and even leukemia. In view of the long-term exposure to HQ could lead to blood system diseases, then this coincide with our initial hypothesis. Another advantage of RNA-Seq technology is the ability to filter out new lncRNAs without annotated. We have identified 11 new lncRNAs, which will provide a new breakthrough for the studies of HQ toxicity mechanisms and leukemia pathogenesis.
More and more literatures have proved that lncRNAs were closely associated with the pathogenesis of leukemia. We have made a review about the lncRNAs as the novel diagnostic biomarkers of leukemia. For instance, MEG3, RUNXOR, NEAT1, LLEST, IRAIN and UCA1 were related to AML; ANRIL, LUNAR1 and T-ALL-R-LncR1 were involved in ALL; LncRNA-BGL3, H19  Oncotarget 95563 www.impactjournals.com/oncotarget and MEG3 participated in CML; lincRNA-p21, DLEU1/ DLEU2 and TRERNA1 were involved in CLL [35]. However, the functions of more lncRNAs we sequenced by RNA-Seq has not been studied. Next, we plan to take the blood samples from patients with occupational leukemia for sequencing in order to find the lncRNAs consistent with this paper.

MATERIALS AND METHODS
Cell culture and HQ treatment TK6 cells are kindly provided by Professor Lishi Zhang of West China Medical College of Sichuan University, and were cultured at 37°C in an atmosphere containing 5% CO 2 in RPMI1640 medium (Gibco Invitrogen, UK) plus 10% horse serum (HS). HQ (purity > 99%, Sigma, St. Louis, MO) was dissolved in phosphatebuffered saline (PBS) and immediately added to TK6 cells (20 μmol/L), the cells were cultured for 48h.

RNA isolation and qRT-PCR
Total RNA of cells was isolated using Trizol reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer's instructions. The RNA concentrations were quantified by NANODROP 2000c (Thermo Scientific, USA), and the quality of RNA was assessed by 2% agarose gel electrophoresis. Extracted total RNA was used to synthesize cDNA using Transcriptor First Strand cDNA Synthesis Kit (Roche, Mannheim, Germany). qRT-PCR reactions were performed in triplicate in a 96well plate containing 1μl of synthesized cDNA (diluted 6 times), FastStart Universal SYBR Green Master (Roche, Mannheim, Germany) on PikoReal (Thermo Scientific, USA) in a total volume of 10μL. The expression levels of lncRNAs were normalized to GAPDH and calculated using the 2 −ΔΔCt method. All of primers were designed and synthesized by Generay Biotechnology (Generay Biotechnology, Shanghai) ( Table 6).

High-throughput sequencing and data set
Total RNA was isolated from cells/tissues using the Trizol (invitrogen) according to the manufacturer's protocol. RNA purity was assessed using the Qubit ® . Each RNA sample had an A260:A280 ratio above 1.8 and A260:A230 ratio above 2.0. RNA integrity was evaluated using the Agilent 2200 TapeStation (Agilent Technologies, USA) and each sample had the RINe above 7.0. Briefly, rRNAs were removed from Total RNA using Epicentre Ribo-Zero rRNA Removal Kit (illumina, USA) and fragmented to approximately 200bp. Subsequently, the purified RNAs were subjected to first strand and second strand cDNA synthesis following by adaptor ligation and enrichment with a low-cycle according to instructions of TruSeq® RNA LT/HT Sample Prep Kit (Illumina, USA). The purified library products were evaluated using the Agilent 2200 TapeStation and Qubit ® 2.0 (Life Technologies, USA) and then diluted to 10 pM for cluster generation in situ on the HiSeq3000 pair-end flow cell followed by sequencing (2 × 150 bp) on HiSeq 3000. Finally, the obtained data were analyzed by bioinformatics. All RNA-Seq data was aligned to hg19 using TopHa v1.4 [36] with default parameters. We used Cuffdiff v1.3 [37] to analyze differentially expressed genes and obtained custom annotations composed of RefSeq entries plus HQinduced TK6 lncRNAs as described below. The difference was considered statistically significant when P < 0.05. To screen the differentially expressed genes, we used threshold values of ≥ 2 (up-regulation) and ≤ -2 (downregulation) fold change (FC). FC refers to the ratio of FPKM between the experimental group and the control group. FPKM is a general method of gene expression levels, which stands for the fragments Per Kilobase of transcript Per Million mapped reads.

Hierarchical clustering analysis
Cluster analysis of differential genes was used to determine the clustering patterns of regulation patterns under different experimental conditions. The difference in FPKM for different samples is used for cluster analysis.

Gene function analysis
The functional annotation of lncRNAs is mainly predicted by their co-expression mRNAs, but there is still no functional annotation of most lncRNAs at present [38]. GO (Gene Ontology) [39] and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway [40] annotations are useful for analyzing differentially expressed lncRNAs related mRNAs functions. GO analysis is mainly composed of cellular components, biological processes and molecular functions. Pathways about the mRNAs associated with dysregulation of lncRNAs can be obtained by KEGG database besides.

LncRNA-mRNA co-expression network analysis
The co-expression network was constructed between significantly dysregulated lncRNAs and mRNAs. Considering the influence of random factors, we found that only using Pearson correlation coefficients (PCC) is not strict, so we calculate zscore values and then obtain P values. The co-expressed lncRNAs and mRNAs pairs with |COR| > 0.85 and P < 0.05 were selected to draw the network. By selecting specific lncRNAs from dysregulated lncRNAs, we mapped the network graphs that interact with their target genes, providing a reference and help www.impactjournals.com/oncotarget for the holistic analysis the function of lncRNAs in the samples.

LncRNAs target genes prediction
LncRNAs play the biological functions mainly by regulating their target genes because they lacked protein coding ability [41]. The target genes here are also known as transcription factors (TF). We mainly use TFSearch software (http://www.cbrc.jp/research/db/TFSEARCH. html) to predict TF of lncRNAs, and we draw lncRNA-mRNA interactive analysis diagrams through Cytoscape. Moreover, the mechanisms of lncRNAs are similar to miRNAs and can be performed by regulating the corresponding mRNAs. LncRNAs regulate target genes mainly through cis and trans regulation. The principle of cis regulation mechanism: for the differentially expressed lncRNAs, search all the coding genes of their upstream and downstream 10 kb range, whichever is selected with a significant co-expression of the lncRNA. The principle of trans-action is mainly based on nucleic acid base pairing.

New lncRNAs prediction
With the strict definition of lncRNAs, we can screen out new lncRNAs by setting some filtering conditions. The filter conditions are as follows: 1) Filtering database known genes and lncRNAs; 2) FPKM > 2; 3) ORF < 300; 4) Filtering RNAs with a protein domain (Pfam); 5) RNAs with a length greater than 200 nt; 6) Filtering RNAs with a coding potential [42]. An ANNOVAR software (http:// www.openbioinformatics.org/annovar/) built-in database is adopted to annotate new lncRNAs.

Statistical analysis
All experiments were repeated at least 3 times independently, and all values were presented as mean ± standard deviation (SD). The data of two experimental groups were compared using the Student's t-test, while One-way ANOVA was analyzed for multi-group comparisons. All data were analyzed and plotted using GraphPad Prism 6 (Graph-Pad Software, La Jolla, CA), and a P value of 0.05 or less was considered statistically significant, * means P < 0.05.

CONCLUSIONS
Overall, our study confirm that lncRNAs might participate in HQ toxicity mechanism even play an important role in leukemia. Further studies will be needed to persuasively demonstrate and interpret the more accurate role of one or several specific lncRNAs in TK6 cells exposed to low dose HQ. These findings bring us one step closer to a better understanding of leukemia pathogenesis, and lncRNAs may be the novel biomarkers for the diagnosis and therapy of leukemia.

Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.