Exploring functions of long non-coding RNAs in HIV-associated neurocognitive disorder through weighted gene co-expression network analysis

HIV-associated neurocognitive disorder (HAND) remains an unresolved issue of cognitive impairment. Due to treatment with combination antiretroviral therapy for nearly two decades, the severe form of HAND is rare, and most cases of HAND are the milder form. However, the proportion of HIV+ individuals with neurocognitive impairment has remained unchanged. Increasing evidence indicates that long noncoding RNAs (lncRNAs) play important roles in the pathogenesis of several types of neurodegenerative disorders. However, there are few relevant reports about the role of lncRNAs in promoting or delaying pathogenesis of HAND. Our study is the first to use the weighted gene co-expression network analysis approach with microarray data from the Gene Expression Omnibus to investigate the functions of lncRNAs that differ significantly between HIV+ patients without neurocognitive impairment and HIV+ patients with neurocognitive disorders. A total of 28 differentially expressed lncRNAs (DE-lncRNAs) were identified including 13 DE-lncRNAs in frontal white matter, 13 DE-lncRNAs in basal ganglia, and 2 DE-lncRNAs in frontal neocortex. Most of these DE-lncRNAs had not been previously reported to be related to HAND. Our coexpression network and function enrichment analysis indicate that DE-lncRNAs should play fundamental roles in HAND development and neurodegeneration associated with synaptic plasticity and synaptogenesis, neuronal inflammation and apoptosis, neurotransmitter transport and metabolism. Although the biological significance of DE-lncRNAs requires further evaluation, our study proposes a simple and efficient strategy to identify important lncRNAs associated with HAND and predict their potential functional roles, which may guide subsequent experimental studies. www.impactjournals.com/oncotarget/ Vol.9, (No.1), Supplement 1, pp: s1302-s1316


INTRODUCTION
For almost a decade, neurocognitive dysfunction associated with HIV infection has been referred to as HIV-associated neurocognitive disorder (HAND), which represents a set of neurocognitive impairments (NCIs) that includes asymptomatic neurocognitive impairment (ANI), mild neurocognitive disorder (MND), and HIV-associated dementia (HAD) [1].Combination antiretroviral therapy (CART) is currently the only valid treatment to delay or prevent progression of HAND, but it is only effective for a subset of patients [1].Because CART is becoming more widely distributed and clearly improving survival time of HIV+ patients, the long-term global impact of HAND will increase.Worldwide, HAND remains a poorly understood cause of cognitive impairment and has persisted in individuals who have received CART [2,3].
In the pre-CART era, the most common form of HAND was HAD, which was almost irreversibly lethal [4,5].Fortunately, the prevalence of HAD has significantly decreased with increased use of CART.Before 1991, 20% of participants included in the Multicenter AIDS Cohort Study met the criteria for HAD, but in 2001-2003, only 5% met the criteria [6,7].As a result, HAD, which is the most severe form of HAND, is now rare.Instead, the prevalence of milder forms of HAND, such as ANI, has increased and accounts for up to 70% of all forms of HAND [8].However, since the introduction of CART in 1996, the proportion of HIV+ individuals with NCI has remained unchanged.HAND remains a major cause of morbidity; 15-55% of HIV+ individuals are estimated to have HAND, which is similar to that in the pre-CART era [2,7].This overlooked issue in which the proportion of HIV+ individuals with NCI has not changed with CART suggests possible innate mechanisms involved in the central nervous system (CNS) of HIV+ individuals who are resistant to impairment until the end stage of AIDS.
In the last decade, an increasing number of studies have revealed that the human genome encodes tens of thousands of long non-coding RNAs (lncRNAs) [8].Furthermore, abundant tissue-specific lncRNAs are expressed in the brain with precisely regulated temporal and spatial expression patterns and mediate developmental complexity, cellular diversity, and activitydependent plasticity [9,10].Dysregulation or mutations in lncRNA gene loci have been linked to pathogenesis of CNS-associated neurodegenerative disorders such as Parkinson's disease, Alzheimer's disease, Huntington's disease, and amyotrophic lateral sclerosis (ALS) [11,12].Therefore, it is not surprising that dysregulation of lncRNA genes has been implicated in promoting or delaying development of HAND even though there are few relevant studies.
Traditional experimental approaches to define functions of lncRNAs, such as gene knockdown, overexpression, or editing, are limited due to an extensive pool of candidates [13].Computational analysis to utilize genome-wide expression profiles is a rapidly developing method to explore functions of lncRNAs by network construction based on the idea that co-expressed transcripts are more likely to be co-regulated and involved in similar functions [14].Consequently, putative functions can be assigned to lncRNAs by identifying protein coding genes with which they are co-expressed [13].
In this study, we utilized a computational strategy of weighted gene co-expression network analysis (WGCNA) [15] and microarray data downloaded from the Gene Expression Omnibus (GEO) to explore functions of differentially expressed lncRNAs (DE-lncRNAs) that differed significantly between HIV+ patients without CNS impairment and HAND patients with neurocognitive disorders.Three brain regions were examined: the frontal neocortex (FC), the basal ganglia (BG), and frontal white matter (WM).A total of 28 DE-lncRNAs were identified including 13 lncRNAs in WM, 13 lncRNAs in the BG, and 2 lncRNAs in the FC.Most of these lncRNAs had not been reported previously to be related to HAND.Then, WGCNA was used to construct co-expression networks, and gene co-expression modules combined with the Database for Annotation, Visualization, and Integrated Discovery (DAVID) were used to explore biological functions.Our study contributes to a comprehensive understanding of DE-lncRNAs, which may guide subsequent experimental studies on the roles of DE-lncRNAs in HAND pathogenesis.

Differentially expressed lncRNA genes
We defined DE-lncRNAs in the brain samples of HIV+ people without NCI and HAND patients with NCI without HIVE based on the following criteria: fold change ≥ 2 and P ≤ 0.05.Three brain regions were examined: FC, BG, and WM.Finally, we obtained 36 probe sets (Figure 1) that represented 28 lncRNAs including 13 lncRNAs in WM, 13 lncRNAs in the BG, and 2 lncRNAs in the FC (Figure 2).In WM, H19 was the only upregulated lncRNA in patients without NCI.H19 has been reported to play a regulatory role in promoting glioma cell invasion by inducing miR-675 expression [16] and promoting angiogenesis, stemness, and drug resistance of glioblastomas [17,18].In addition, H19 is associated with neural tube defects [19].PRKCQ-AS1 has been reported to be involved in oligodendrocyte differentiation and CNS myelination [20].ELOVL1 was demonstrated as a potential target in treatment of X-ALD (an inherited neurodegenerative disorder) [21].IPW was linked with Prader-Willi syndrome, which is characterized by mild to moderate intellectual impairment and behavioral problems [22].In the BG, all DE-lncRNAs were upregulated in the HAND patients with NCI.TUG1, MALAT1, NEAT1, and MEG3 have been associated with brain development, functional diversification, and neurodegenerative diseases.TUG1 and NEAT1 were found to be upregulated in the brains of patients with Huntington's disease (an inherited disorder that results in death of brain cells) [23].MALAT1 is expressed at high levels in neurons and low levels in glia and astrocytes of the brain.Moreover, knockdown of MALAT1 decreases synaptic density, and overexpression of MALAT1 leads to cell-autonomous upregulation in synaptogenesis [24].MEG3 was found to be highly enriched in neuronal nuclei in gray matter and only in glia in white matter [25].In addition, MEG3 was reported to be a brain-specific tumor suppressor that suppresses cell growth, promotes p53-mediated apoptosis, and is lost in pituitary tumors [26,27], gliomas [28], and meningiomas [29].

Co-expression network construction
The co-expression network of DE-lncRNAs and differentially expressed mRNAs was constructed by the WGCNA in the 3 brain regions, WM, BG, and FC.In WM, WGCNA was performed on 2565 differentially expressed probes including 13 probes of lncRNAs from 13 samples.After discarding 3 outlier samples, the connectivity between genes in the gene network met a scale-free network distribution when the soft threshold power β was set to 12. Highly similar modules were identified by clustering and then merged with a height cut-off of 0.25.Then, 16 co-expressed modules ranging in size from 40 to 547 genes (assigning each module a color for reference) were identified (Figure 3A).In the BG, WGCNA was performed on 2779 differentially expressed probes including 17 probes of lncRNAs from 13 samples.After 2 outlier samples were discarded and the soft threshold power beta was set to 13, highly similar modules were merged with a cut-off of 0.25.Finally, 8 co-expressed modules ranging in size from 85 to 932 genes were established (Figure 3B).In the FC, WGCNA was performed on 2067 differentially expressed probes including 4 probes of lncRNAs from 13 samples.Two outlier samples were discarded, and then the soft threshold power beta was set to 15.Similar modules were merged using a cut-off value of 0.25, and 8 co-expressed modules (ranging in size from 80 to 824 genes) were constructed (Figure 3C).All the "grey" modules were reserved for genes identified as not co-expressed.

Finding modules of interest and functional annotation
It is of great biological significance to identify modules significantly associated with clinical features.The NCI score of HIV+ patients was selected as the   clinical feature of interest to study the relationships with the identified modules.The modules were also examined for association with age, race, sample freeze time, and ADV exposure.In WM, only the yellow module had a positive significant relationship with impairment score (R = 0.76, P = 0.01).In addition, 2 modules of black (R = −0.72,P = 0.02) and dark turquoise (R = −0.74,P = 0.01) were significantly negatively related with impairment score (Figure 4A).In the BG, 3 modules of light yellow (R = 0.72, P = 0.01), dark green (R = 0.68, P = 0.02), cyan (R = 0.65, P = 0.03) were positively correlated with impairment score.Furthermore, only 1 module of dark red was negatively correlated with impairment score (R = −0.64,P = 0.03) (Figure 4B).In the FC, 5 modules of magenta (R = −0.86,P = 0.0006), green (R = −0.81,P = 0.002), tan (R = −0.77,P = 0.006), midnight blue (R = −0.65,P = 0.03), and blue (R = −0.62,P = 0.04) were negatively correlated with impairment score.In addition, only the turquoise module (R = 0.64, P = 0.03) had a positive relationship with impairment score (Figure 4C).
To determine functions of DE-lncRNAs with the corresponding modules, we listed 18 modules (2 modules in the FC, 9 modules in WM, 7 modules in the BG) containing the DE-lncRNAs for downstream analysis in Table 1.All genes in the selected modules were mapped into the DAVID database and subjected to GO BP terms and KEGG pathways.Supplementary listed all significantly enriched GO BP terms (P ≤ 0.05) for the 18 modules (Supplementary Table 2).Figure 5 shows 3 representative terms for each module.Table 2 lists the significant KEGG pathways with P ≤ 0.05 and gene counts ≥ 2 of 6 modules that were significantly correlated with NCI score.Supplementary listed all significant KEGG pathways with P ≤ 0.05 and gene counts ≥ 2 for the 18 modules (Supplementary Table 3).

Modules of interest in WM
The yellow module, which contained 518 genes including CRHR1-IT1 and was positively correlated with NCI, showed functional enrichment in multicellular organism growth, calcium ion transmembrane transport, and protein phosphorylation (Figure 5B, Supplementary Table S2).Pathway analysis further revealed that the genes in the module were enriched in certain signal transduction pathways such as calcium, GnRH, MAPK, neurotrophin, and oxytocin signaling pathways (Table 2, Supplementary Table 3).For the black module with 379 genes and containing AGPAT4-IT1, the most common GO BP term was regulation of transcription, DNA-templated (Figure 5B, Supplementary Table 2).Pathway analysis of black module revealed that the genes were enriched in similar signal transduction pathways such as calcium, GnRH, and oxytocin signaling pathways in the yellow module and in the typical signal transduction pathways of insulin, estrogen, and glucagon (Table 2, Supplementary Table 3).The genes of both modules were also enriched in metabolic pathways, such as aldosterone synthesis and secretion, as well as adhesion pathways such as focal adhesion.

Modules of interest in the BG
The dark green module of 157 genes and including NEAT1 showed functional enrichment in signal transduction, negative regulation of transcription initiation from RNA polymerase II promoter, and regulation   5A, Supplementary Table 2).However, there was no significant pathway in KEGG pathway analysis.The cyan module with 85 genes and containing MEG3 and MALAT1 showed functional enrichment only in vesicle fusion (Figure 5A, Supplementary Table 2), and pathway analysis indicated only the thyroid hormone signaling pathway (Table 2, Supplementary Table 3).The dark red module containing 206 genes that included UG0898H09 was enriched in regulation of transcription from RNA polymerase II promoter, cellular response to brain-derived neurotrophic factor stimulus, and aging (Figure 5A, Supplementary Table 2).Pathway analysis revealed that the genes in the module were enriched in the thyroid hormone signaling pathway, pancreatic secretion, Huntington's disease, and the AMPK signaling pathway (Table 2, Supplementary Table 3).The light yellow module contained 392 genes including GP1BB, LINC00982, and ATP6V1G2-DDX39B and was enriched in GO BP terms of SRP-dependent co-translational protein targeting to membrane as well as viral transcription and translational initiation (Figure 5A, Supplementary Table 2).Pathway analysis revealed that the genes were enriched in ribosome (Table 2, Supplementary Table 3).

Modules of interest in the FC
The magenta module with 80 genes including DCTN1-AS1 showed functional enrichment in extracellular matrix disassembly, heart development, and positive regulation of gene expression (Figure 5C, Supplementary Table 2).Pathway analysis revealed that the genes in the module were enriched in certain signal transduction pathways such as the PI3K-Akt, mTOR, and TGF-beta signaling pathways (Table 2, Supplementary Table 3).

Hub-based analysis and visualization of coexpression networks
We further identified the intramodular hub genes of the 18 modules that contained DE-lncRNAs.We selected the 10% of nodes with the highest connectivity as hub genes from the selected modules.Finally, we found the 6 modules in which DE-lncRNAs were characterized as the hub genes (Table 3).Moreover, the yellow module in WM and the cyan module in the BG, which significantly correlated with impairment score, identified DE-lncRNAs CRHR1-IT1 and MEG3 as hub genes, respectively.Figure 6A presents a network of 85 genes in the cyan module in the BG; 2 probes annotated to MEG3, and 5 probes displayed the highest connectivity as hub genes with different colors.We also displayed the network of 51 hub genes in the yellow module in WM since there were many genes (518 genes) in this module (Figure 6B).CRHR1-IT1 was the only DE-lncRNA of the hub genes with high connectivity.

DISCUSSION
HAND, which was firstly named AIDS dementia complex (ADC) when recognized in the 1980s and was observed as diffuse myelin pallor in the cerebral WM at that stage [29,30].As is recognized now, HIV acts as a Trojan horse, plunging into CNS via infected monocytes to infect microglial cells but not neurons [31].The virus is more gathered in WM and BG than in the cortex after entering the brain [31,32].Cortical neuronal loss may occur late in the cortex, but WM pallor and gliosis are more happened initially [31,33].Imaging surveys of patients with HAND also indicated the frontal WM is affected earliest, followed by the BG and then the frontal gray matter [34].So far, CART is the only treatment to prevent or delay progression of HAND, but it is not broadly effective.After use of CART for almost 2 decades, the proportion of HIV+ individuals with neurocognitive disorders has remained unchanged.This suggests that there is an unknown mechanism involved in the CNS of HIV+ individuals who are resistant to impairment until the end stage of AIDS.We utilized a computational strategy of WGCNA and microarray data downloaded from GEO to perform a systematic study of lncRNAs that differ significantly between HIV+ patients without CNS impairment and HAND patients with neurocognitive disorders.Three brain regions were examined: the FC, the BG, and WM.A total of 28 DE-lncRNAs were identified, and most of them have not been reported previously to be related to HAND.Then, we used WGCNA to construct a co-expression network of DE-lncRNAs with differentially expressed mRNAs to explore their roles in preventing or promoting development of HAND.
In brain regions of WM, CRHR1-IT1, which was a hub gene with high connectivity in the yellow module, was significantly and positively correlated with NCI score and upregulated in HIV+ patients with NCI.CRHR1-IT1 is a spliced transcript variant of CRHR1.The protein CRHR1 is a critical regulator of the neuroendocrine, behavioral, and autonomic stress response [35] In addition, a recent report indicated that CRHR1 is involved in altering brain plasticity and neuronal inflammation associated neuroprotection and cognitive recovery following ischemic brain insults [36].And, the enhanced CRHR1-IT1 expression in the WM of HAND may exert negative  RARA encodes a protein, which is retinoic acid receptor alpha (RARA).Retinoids are important for maintenance of the adult nervous system and is an important regulator of brain vascular development by WNT signaling [37].
The comparison of expressions of retinoic acid receptors between normal and schizophrenia brain sections indicated a two-fold increase of RARA in the schizophrenia sections [38].Disruption of the retinoid signal pathway also appears to increase the amount of amyloid β protein in the brain, one of the major components involved in amyloid plaque formation [36], which was also found in the brain of HAND patients [38].The upregulated expression of RARA in patients with HAND as schizophrenia can disrupt the normal signal pathway of retinoid associated with amyloid β protein in the brain resulted in the cognitive disorder of HIV+ patients.Another hub gene in brain regions of WM named ELVOL1, elongation of very long chain fatty acids-like 1, in the dark grey module.ELOVL1 targets very long chain fatty acids (VLCFA), the expression of ELOVL1catalyzed the synthesis of VLCFA and the accumulation of VLCFA in cerebral WM was reported leading to a neuroinflammatory disease process associated with neurodegeneration [39].Therefore, the enhanced ELVOL1 expression in WM of HIV+ patients can induce the accumulation of VLCFA resulting in neurodegeneration.In addition, DIP2B mRNA was most highly connected to ELVOL1 according to the WGCNA results.DIP2B, a brain-expressed gene, is located near one of folate-sensitive fragile sites, FRA12A.The increased levels of DIP2B mRNA, especially CGGrepeat expansion in the DIP2B gene is associated with the fragile site FRA12A on chromosome 12q13.1 and a high level of cytogenetic expression of FRA12A is significantly associated with mental retardation [40].Hence, the higher DIP2B expression in WM of HIV+ patients may mediate the neurocognitive problems associated with FRA12A.
In brain regions of BG, MEG3, maternally expressed gene 3, was the hub gene in the cyan module.This module was positively and significantly associated with NCI score.MEG3 was found to be highly enriched in neuronal nuclei in gray matter and only in glia in white matter [25].Twelve alternatively spliced transcript variants are transcribed from this gene, and all of them are lncRNAs.In addition, MEG3 has been reported to be a brain-specific tumor suppressor promotes p53-mediated apoptosis and is lost in pituitary tumors [26,27], gliomas [28], and meningiomas [29].Furthermore, Synaptic plasticity, has long been postulated as a cellular mechanism of learning and memory [41] and characterized by an increase in the size of dendritic spines and the number of synaptic AMPA-type glutamate receptor (AMPAR).However, the knockdown of Meg3 enhanced surface AMPAR expression under basal conditions in primary cortical neurons [42].Therefore, the improved Meg3 expression in BG may interfere the normal process of synaptic plasticity of HAND patients to decrease the patients' ability of learning and memory.MEG3 was connected with 111 genes from the WGCNA results (w ≥ 0.05).Of these, 3 protein coding genes (N PEPL1, EEA1, STX6) are involved in vesicle fusion and docking.These biological processes are linked with neurotransmitter release and uptake at the synapse were also associated with synaptic plasticity.MALAT1, metastasis-associated lung adenocarcinoma transcript 1, was also in the cyan module in the BG, is expressed at high levels in neurons and low levels in glia and astrocytes of the brain.Although, it was not the hub one, MALAT1 was highly upregulated in HIV+ patients with NCI.Moreover, MALAT1 lncRNA was reported to modulate synapse formation in neurons by regulating the expression of genes involved in synaptogenesis.The knockdown of MALAT1 decreases synaptic density, and overexpression of MALAT1 leads to cell-autonomous upregulation in synaptogenesis [24].In HIV+ patients with NCI, the higher expression of MALAT1 associated with synaptogenesis may be one of the causes for neurocognitive disorder.TUG1, taurine up-regulated 1, was the hub gene in the light green module in brain regions of BG.TUG1 lncRNA is mainly expressed in the developing retina and brain and is associated with neurodegenerative diseases [23].The knockdown of TUG1 resulted in upregulation of cell-cycle genes [43] and TUG1 was shown to be a direct downstream target of p53 [44].Thus, TUG1 would appear to be a pro-apoptosis factor in neurons, and p53 was upregulated in neurons was known in HAND patients [44].Thus, the upregulation of TUG1 possibly induced by p53 activation we observe in HIV+ patients with NCI may in fact lead to the apoptosis of neurons.Moreover, TUG1 sponges microRNA-9 to promote neuronal apoptosis in ischemia [45] and serves as a micro-RNA-26a sponge in human glioma cells to play a tumor-suppressive role [46] were also identified the pro-apoptosis role of TUG1.A total of 53 hub genes were connected highly with TUG1 in this module.In addition, 3 protein coding genes (TACC1, LRP6, NTRK2) are involved in the biological process of cerebral cortex development with the most significance (the lowest P value) with GO term analysis.NTRK2, which encodes the protein neurotrophic tyrosine receptor kinase, has been reported to play an important role in modulation of the brain to stress by a neuroplastic pathway  and is associated with multiple psychiatric disorders [47].Thus, the role of upregulated NTRK2 expression in HIV+ patients may be link with neurocognitive impairment.NEAT1, nuclear paraspeckle assembly transcript 1, belongs to the dark green module in brain regions of BG.This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles and acts as a transcriptional regulator [48].NEAT1 has been reported to be upregulated in the brains of patients with Huntington's disease (an inherited disorder that results in death of brain cells) and also contains experimentally detected p53 binding sites as TUG1 [23].NEAT1 was shown to play a role in HIV-1 replication to enhance virus production through increased nucleus-tocytoplasm export of Rev-dependent instability element (INS)-containing HIV-1 mRNAs [49].Therefore, in HIV+ patients with NCI, the enhanced expression of lncRNA NEAT1 could enhance the HIV-induced neurotoxicity and also lead to the apoptosis of neurons as the role of TUG1.

Module
DCTN1-AS1, DCTN1 antisense RNA 1, was in the magenta module in brain regions of FC that was correlated with NCI score, and the function is unknown.However, DCTN1, which encodes dynactin 1, has been linked to slowly progressing motor neuron diseases including Amyotrophic Lateral Sclerosis (ALS), ALS with Frontotemporal Dementia, and Perry Syndrome [50].Mutation of DCTN1 is thought to cause neuronal dysfunction by disruption of axonal transport and may play a crucial role in development of neurodegeneration [51].The higher levels of DCTN1-AS1 expression in FC of HIV+ patients with NCI may interfere normal DCTN1 expression and function associated with neurocognitive impairment.
Our current knowledge of lncRNAs has grown exponentially over the last few years.It has been shown that lncRNAs are expressed in a highly cell-and tissuespecific manner, and they are particularly abundant within the nervous system.Accumulating numbers of studies have indicated that lncRNAs play important roles in the pathogenesis of neurological and psychiatric disorders.However, there are few relevant reports about the roles of lncRNAs in promoting or delaying the pathogenesis of HAND.Our study is the first to use the WGCNA approach to investigate functions of lncRNAs that differ significantly between HIV+ patients without CNS impairment and HAND patients with neurocognitive disorders.HIV acts as a Trojan horse, gathered in WM and BG firstly than in the cortex after entering the brain.WM is affected earliest and gliosis are more happened initially, followed by the BG and then the frontal gray matter.Cortical neuronal loss may occur late in the cortex.A total of 28 DE-lncRNAs were identified including 13 lncRNAs in the WM, 13 lncRNAs in the BG, and 2 lncRNAs in the FC.The different distribution in the three regions of brain is in consistence with the process of HIV entering to the brain.Most of these lncRNAs have not been reported to be related to HAND.Our co-expression network and function enrichment analysis indicate that the DE-lncRNAs should play fundamental roles in HAND development and neurodegeneration associated with synaptic plasticity and synaptogenesis, neuronal inflammation and apoptosis, neurotransmitter transport and metabolism.Although the samples size was small (6 for HIV+ patients without NCI and 7 for HAND patients with NCI) due to the samples of WM, BG and FC from HIV patients' brain were extremly valuable and the biological significance of the DE-lncRNAs requires further evaluation.Our study proposes a simple and efficient strategy to identify important lncRNAs associated with HAND and predict their potential functional roles, which may guide subsequent experimental studies.

Gene expression array data and identification of lncRNAs
The microarray data were generated using the Affymetrix Human Genome U133 Plus 2.0 array (Affymetrix, Santa Clara, CA, USA) and were deposited under accession number GSE35864 in the GEO database (http://www.ncbi.nlm.nih.gov/geo), which is a public functional genomics data repository.Twenty-four human subjects were included and divided into 4 groups: A) 6 uninfected controls, B) 6 HIV-1 infected subjects without substantial NCI, C) 7 HIV-1 infected subjects with substantial NCI without HIV encephalitis (HIVE), and D) 5 HIV-1 infected subjects with substantial NCI and HIVE.RNAs were extracted from the FC, WM, and BG.Clinical features of the NCI score, age, race, sample freeze time, and Antiretroviral (ARV) exposure were acquired from the National NeuroAIDS Tissue Consortium of USA (http://www.nntc.org/gene-arrayproject).
We selected groups B and C to find lncRNAs that differed between HIV+ patients without NCI and HAND patients with NCI without HIVE.To define the corresponding probe sets in the microarray that represented lncRNAs, we used an lncRNA annotation pipeline based on the method described by Feng et al. [52].First, we downloaded files as "RNA, long noncoding" through the HUGO Gene Nomenclature Committee [53] database (http://www.genenames.org/ at 2/16/2017) and then combined the file with Affymetrix probe ID by matching Entrez gene ID or Ensemble gene ID.Second, we downloaded the Affymetrix annotation file for the U133 2.0 Plus array (from http://www.affymetrix.com/at 2/17/2017) to identify probe sets mapped to lncRNA [54].After removing the duplicates, 3054 probe sets annotated to lncRNA transcripts were obtained (Supplementary Table 1).

Microarray data processing and screening of DE-lncRNAs
The Robust Multichip Average method in the R/ Bioconductor Affy package (http://www.bioconductor.org/; Affymetrix, Inc.) was used to pre-process the raw microarray CEL files by background adjustment, quintile normalization, and log-transformation [55].DE-lncRNAs in 3 regions (FC, WM, and BG) were identified by the limma package [56] in the R software.The Benjamini-Hochberg false discovery rate was applied to adjust the P values [57].The threshold was an absolute log2 fold change (FC) > 2 and P < 0.05 for DE-lncRNAs and P < 0.05 for differentially expressed genes used for WGCNA.Cluster analysis of DE-lncRNAs was also performed using the Pheatmap package of R [58].

Co-expression network construction
The DE-lncRNA to mRNA co-expression network was constructed using the WGCNA R package [15].The outlier samples were removed to ensure that the results of network construction were reliable.A suitable soft threshold power β was calculated and sorted out to approximate scale-free topology criteria.Then, the adjacencies between all included genes were calculated and transformed into a topological overlap matrix (TOM) [59], and the corresponding dissimilarity (1-TOM) was computed.Modules were identified as groups of highly interconnected genes using 1-TOM as the distance measure with a deep Split value of 2 and a minimum size cutoff of 30 by average linkage hierarchical clustering with a dynamic tree-cut algorithm to cluster dendrogram branches into gene modules [60].Highly similar modules were identified by clustering and merged with a height cut-off of 0.25.Each of the merged modules was assigned a unique color.

Finding modules of interest, functional annotation, and pathway enrichment analysis
Pearson's correlation test was used to calculate the correlation between modules and clinical features to explore biologically related modules.The modules in which expression was significantly related to NCI after adjustments for age, race, sample freeze time, and ADV exposure were selected as modules of interest to be further studied.DAVID (https://david-d.ncifcrf.gov/)offers functional annotation tools to define the biological processes of gene modules by using a hypergeometric distribution algorithm [61].To explore potential mechanisms of how module genes impact NCI, all genes of modules were mapped into DAVID v6.732 and subjected to GO biological process (BP) terms and KEGG pathway enrichment analysis [61] with P < 0.05 and the number of enriched genes ≥ 2.

Identifying hub genes and visualization
It has been suggested that the highly connected hub nodes are at the center of the constructed network [15] and centralized genes may play more important roles in cellular function compared with peripheral genes [62].We selected the 10% of nodes with the highest connectivity as hub genes from modules and used Cytoscape (version 3.2.1) to display selected modules or their hub nodes co-expression network when the module was huge.The associations between lncRNAs and mRNAs were also displayed and connected by solid lines to build the lncRNA-mRNA co-expression network.

Figure 1 :
Figure 1: Heatmaps of differential probes expression of microarray data annotated to lncRNAs in three brain regions for WM (A), BG (B) and FC (C).

Figure 2 :
Figure 2: DE-lncRNAs in the three brain regions for WM (A), BG (B) and FC (C) with Neurocognitive Impairment or not.Red denotes up-regulation, and green indicates down-regulation

Figure 3 :
Figure 3: Clustering dendrograms of genes.Gene clustering tree (dendrogram) obtained by hierarchical clustering of adjacency based dissimilarity.The colored row below the dendrogram indicates module membership identified by the dynamic tree cut method, together with assigned merged module colors and the original module colors.(A), 16 co-expressed modules ranging in size from 40 to 547 genes were identified in the WM.(B), In the BG, 8 co-expressed modules ranging in size from 85 to 932 genes were established.(C), In the FC, 8 co-expressed modules (ranging in size from 80 to 824 genes) were constructed.All the "grey" modules were reserved for genes identified as not co-expressed.

Figure 4 :
Figure 4: Module-feature associations.Each row corresponds to a module Eigengene and each column to a clinical feature.Each cell contains the corresponding correlation in the frst line and the P-value in the second line.The table is color-coded by correlation according to the color legend.(A), WM; (B), BG; (C), FC.

Figure 5 :
Figure 5: Barplot of representative GO biological process (BP) terms of 16 modules (except the saddlebrown module in WM without significant GO BP terms).Tree representative GO BP terms were chosen from top signifcant terms for each module.The y-axis depicts names of BP terms, and the x-axis depicts -log10 (P-value).Bar color denotes the module color.(A), WM; (B), BG; (C), FC.

Figure 6 :
Figure 6: Cytoscape network visualization of selected modules or their hub nodes co-expression network, (A) is a network of 85 genes in cyan module in BG region, two probes annotated to MEG3 and other 5 probes displayed the highest connectivity as hub genes with different colors.(B) the network of 51 hub genes in yellow module in WM region, CRHR1-IT1 was the only one of DE-lncRNAs of the hub genes with high connectivity.