High throughput sequencing identifies an imprinted gene, Grb10, associated with the pluripotency state in nuclear transfer embryonic stem cells

Somatic cell nuclear transfer and transcription factor mediated reprogramming are two widely used techniques for somatic cell reprogramming. Both fully reprogrammed nuclear transfer embryonic stem cells and induced pluripotent stem cells hold potential for regenerative medicine, and evaluation of the stem cell pluripotency state is crucial for these applications. Previous reports have shown that the Dlk1-Dio3 region is associated with pluripotency in induced pluripotent stem cells and that incomplete somatic cell reprogramming causes abnormally elevated levels of genomic 5-methylcytosine in induced pluripotent stem cells compared to nuclear transfer embryonic stem cells and embryonic stem cells. In this study, we compared the pluripotency associated genes Rian and Gtl2 in the Dlk1-Dio3 region in exactly syngeneic nuclear transfer embryonic stem cells and induced pluripotent stem cells with the same genomic insertion. We also assessed 5-methylcytosine and 5-hydroxymethylcytosine levels and performed high-throughput sequencing in these cells. Our results showed that the expression of Rian and Gtl2, which is related to pluripotency in induced pluripotent stem cells, did not correlate with the pluripotency state in nuclear transfer embryonic stem cells, and that no significant difference in 5-methylcytosine or 5-hydroxymethylcytosine levels was observed between fully and partially reprogrammed nuclear transfer embryonic stem cells or induced pluripotent stem cells. Through syngeneic comparison, our study identifies for the first time that Grb10 is associated with the pluripotency state in nuclear transfer embryonic stem cells.


INTRODUCTION
Reprogramming refers to the erasure and remodeling of epigenetic marks during mammalian development in vivo and is an approach that changes differentiated cells into dedifferentiated cells in vitro. Somatic cell nuclear transfer (SCNT) and transcription factor (TF) mediated reprogramming are two major in vitro reprogramming techniques. Studies of mammalian cloning and reprogramming have grown substantially since the first somatic cell cloned sheep, Dolly, was born [1]. The derivation of embryonic stem cells (ESCs) from cloned embryos by SCNT was an important achievement, and nuclear transfer ESCs (ntESCs) can be successfully derived from various adult cell types from mice, rhesus macaques, and humans, among others [2][3][4][5][6][7]. However, the low reprogramming efficiency of SCNT limits the applications of ntESCs, although many solutions have been developed to address this issue. The addition of trichostatin A (TSA) and scriptaid (SCR) to the culture medium can improve SCNT efficiency [8][9][10][11]. Xist-deficient cumulus cells and Sertoli cells have been shown to robustly improve efficiency for mouse SCNT [12], and Kdm4A overexpression increased the blastocyst formation rate of human SCNT embryos [13].
Takahashi and Yamanaka demonstrated that pluripotent stem cells can be obtained from mouse embryonic or adult fibroblasts by introducing four transcription factors, Oct3/4, Sox2, c-Myc, and Klf4, under embryonic stem cell culture conditions [14]. Extensive studies examining TF mediated reprogramming were performed following the discovery that induced pluripotent stem cells (iPSCs) can support the full-term development of tetraploid blastocyst complemented embryos in mice [15,16]. Many of these studies aimed to improve reprogramming efficiency. Small molecules such as vitamin C have been used to improve efficiency in both mouse and human TF mediated pluripotent reprogramming [17]. A recent study found that a combination of several small molecules can reprogram mouse somatic cells, increasing reprogramming efficiency to 0.2% [18]. Moreover, the expression of certain genes can improve TF mediated reprogramming efficiency. Zscan4 overexpression increased iPSC efficiency and quality in mice [18], whereas Nr5a2 can replace Oct4 during reprogramming and improve efficiency in mice [19].
To better understand SCNT and TF mediated reprogramming, the methylation state of imprinted mouse genes, epigenetic memory, somatic mutation and telomeric rejuvenation of ntESCs and iPSCs have been compared [20][21][22][23]. The DNA methylation and transcriptome profiles of human ntESCs correspond closely to those of in vitro fertilized embryonic stem cells (IVF-ESCs), whereas iPSCs exhibit differences, retaining residual DNA methylation patterns typical of parental somatic cells [24]. Comparisons of ntESCs and iPSCs can be used to identify high-quality ntESCs or iPSCs for future regenerative medicine applications. Previous studies have shown that activation of the Dlk1-Dio3 imprinted genomic region is required for TF induced iPSCs to obtain full pluripotency and that the expression of the imprinted genes Rian and Gtl2 was higher in fully reprogrammed iPSCs than in partially reprogrammed iPSCs [25,26]. However, it remains unclear whether the Dlk1-Dio3 region is also associated with the pluripotency state of ntESCs.
In this study, we first generated exactly syngeneic ntESCs and iPSCs from adipocyte progenitor cells (APCs) isolated from the all-iPSC mice that were produced through primary TF mediated reprogramming in our previous study [15]. This secondary reprogramming system maintained the same genomic insertion in both ntESCs and iPSCs. By comparing fully and partially reprogrammed ntESCs and iPSCs, we observed that the imprinted genes Rian and Gtl2 in the Dlk1-Dio3 region, which are related to the pluripotency state of iPSCs, were not correlated with the pluripotency state in ntESCs. A previous study has shown that incomplete somatic cell reprogramming caused abnormally high genomic 5-methylcytosine (5mC) levels in iPSCs compared to ntESCs and ESCs, suggesting that there might be different 5mC levels between ntESCs and iPSCs [27]. We did not observe a significant difference in 5mC or 5-hydroxymethylcytosine (5hmC) levels between fully and partially reprogrammed ntESCs and iPSCs. Using high throughput sequencing, our comparison of fully and partially reprogrammed ntESCs showed that Grb10 was associated with the pluripotency state in ntESCs, which was verified with quantitative reverse-transcription PCR in ntESCs from both APCs and fibroblast cells. By using syngeneic comparison, our study provides valuable information regarding ntESCs and iPSCs and identifies for the first time an important gene associated with the pluripotency state in ntESCs.

The derivation of ntESCs and iPSCs from APCs in a secondary reprogramming system
To perform an exact syngeneic comparison of ntESCs and iPSCs in this study, a secondary reprogramming system was established.
NtESCs were derived from the blastocysts of SCNT embryos. SCNT embryos were obtained by transferring the nuclei of APCs into enucleated oocytes (Table 1). SCNT blastocysts were plated onto a feeder layer of MEFs, and outgrowths emerged after approximately 5 to 10 days. In total, 38 ntESCs cell lines were established from 440 cloned embryos.
iPSCs were induced from APCs by adding doxycycline. After approximately 7 days, ES-like colonies emerged. In total, 45 iPSC cell lines were established. Hereafter, we designate the ntESCs and iPSCs derived from APCs by these two methods as AN and AI, respectively.

Characterization of different pluripotency states in ntESCs and iPSCs derived from APCs
To evaluate the pluripotency state in syngeneic ntESCs and iPSCs, we examined the karyotypes of the derived cell lines and identified 31 AN cell lines and 41 AI cell lines with normal karyotypes (Table 2). There were only minor differences in the percentage of cell lines with a normal karyotype between the AN and AI cell lines.
Next, we observed that AN and AI cell lines exhibited a typical mouse ESC morphology, with a compact appearance and a well-defined border. The cell lines were also positive for AP activity (Figure 1A). AN and AI cell lines expressed both protein and mRNA for pluripotency marker genes (Figure 1B and 1C).
Furthermore, we examined the developmental potential of the cell lines using teratoma formation as an in vivo differentiation assay. Histological examination (H&E) revealed that AN and AI could give rise to teratomas containing tissues from all three germ layers (Figure 1D and Supplementary Table 1). In addition, under in vitro differentiation conditions, differentiated cells derived from AN and AI cell lines exhibited upregulated markers for all three germ layers compared to undifferentiated cell lines (Figure 1E and 1F).
To investigate the pluripotency state of AN and AI cell lines, we utilized a tetraploid complementation assay and performed germline transmission (Figure 1G and Supplementary Table 1). We identified four fully pluripotent ntESC lines (AN1, AN9, AN15 and AN20), five partially reprogrammed ntESC lines (AN2, AN3, AN5, AN6 and AN7), three fully pluripotent iPSC lines (AI3, AI7 and AI10) and one partially reprogrammed iPSC line (AI9). Hereafter, we designated the fully pluripotent ntESCs as AN F, the partially reprogrammed ntESCs as AN P, the newly derived fully pluripotent iPSCs together with the previously identified fully pluripotent GS32 as AI F, and the newly derived partially reprogrammed AI9 together with the previously identified partially reprogrammed GS4 and GS9 as AI P. These fully and partially pluripotent cell lines were utilized for further investigation.

Expression of Rian and Gtl2 in the Dlk1-Dio3 region in ntESCs and iPSCs
Previous studies have shown that the imprinted genes Rian and Gtl2, which are located in the Dlk1-Dio3 region, are associated with pluripotency state in mouse iPSCs [32,33]. The quantitative PCR results showed that the expression levels of Rian and Gtl2 were significantly different between AI F and AI P, which is consistent with previous reports showing that Rian and Gtl2 were expressed at a higher level in fully reprogrammed iPSCs than in partially reprogrammed iPSCs [25,26]. However, we found that there was no significant difference in the expression of Rian and Gtl2 between AN F and AN P (Figure 2A and 2B), suggesting that pluripotency state related genes in the Dlk1-Dio3 region in iPSCs might not be associated with pluripotency in ntESCs. Rtl1 is another gene in the Dlk1-Dio3 region that showed little expression difference between AN F and AN P, but there was a significant difference in Rtl1 expression between AI F and AI P (Figure 2C).

5mC and 5hmC DNA modifications in ntESCs and iPSCs were not different
Evaluation of the epigenetic modification in ESCs, ntESCs and iPSCs suggested that incomplete somatic cell reprogramming might be caused by abnormally high levels of genomic 5mC in iPSC lines, even though iPSCs had germ-line chimeric properties [27]. Here, we examined the 5mC and 5hmC levels in ntESCs and iPSCs and determined whether 5mC content correlated with fully or partially reprogrammed stem cells. Our results showed that there was no significant difference in 5mC or 5hmC levels between AN F and AN P cell lines, or between AI F and AI P cell lines (Figure 3A and Figure 3B).

The identification of Grb10 associated with pluripotency state in ntESCs
We used high throughput sequencing to compare gene expression differences between AN F and AN P, and between AI F and AI P. Compared to AN P and AI P, respectively, 84 genes were significantly upregulated in AN F and 391 genes were upregulated in AI F. Gene Ontology (GO) analysis showed that most of the upregulated genes in AN F were related to transcription (Figure 4A). The upregulated genes in AI F were enriched for mesodermal development-related processes including blood vessel, vasculature and skeletal development (Figure 4B). Interestingly, we found that Grb10, a maternally imprinted gene, was upregulated in both the AI F and AN F cell lines (Figure 4C). Our quantitative PCR analysis confirmed that Grb10 expression in fully reprogrammed AN cells was higher than in partially reprogrammed cells (Figure 4D). In AI cell lines, Grb10 expression in most fully reprogrammed AI cell lines was higher than in partially reprogrammed cell lines except for AI9 (Figure 4E). These results suggest that Grb10 might be associated with pluripotency state in ntESCs. To further verify that Grb10 was associated with pluripotency state in ntESCs, ntESCs (FN) derived from SCNT blastocysts using tail-tip fibroblasts (TTFs) as donor cells were examined. The TTFs were from the 1°-all-iPSC mice generated by primary TF induced pluripotent reprogramming in our previous study [15]. The 1°-MEF-iPSC-37 cells (37iPSC) were derived from 13.5 dpc embryos collected from female 129S2/Sv mice mated with Rosa26-M2rtTA transgenic mice and were shown to be fully pluripotent by their capacity to generate the all-iPSC mice. FN cell lines were also grouped into fully and partially reprogrammed cell lines using the tetraploid complementation assay and were designated FN F and FN P, respectively (unpublished data).
This assay showed that Grb10 expression was significantly higher in FN F than in FN P (Figure 4F), which indicates that Grb10 might work as an important molecular marker for indicating pluripotency state in ntESCs derived from different cell types.

DISCUSSION
In this study, we compared syngeneic ntESCs and iPSCs and showed that the expression of pluripotency associated genes in the Dlk1-Dio3 region and 5mC/5hmC levels could not be used to distinguish fully from partially reprogrammed ntESCs, and demonstrated that Grb10 is associated with the pluripotency state in ntESCs.
Imprinted genes are expressed from a single parental allele and have parental-specific epigenetic modifications [34]. The imprinted genes Rian and Gtl2 are located in the Dlk1-Dio3 region on distal mouse chromosome 12 [32,33]. A previous study indicated that Rian and Gtl2 are significantly downregulated transcripts in iPSCs when comparing genome wide expression between genetically identical mouse ESCs and iPSCs [25]. In our study, the expression of Rian, Gtl2 and Rtl1 was higher in AI F than in AI P and was associated with iPSC quality, which was consistent with the previous studies [25,26]. It remained unclear whether the expression of Rian, Gtl2 and Rtl1 is also associated with the pluripotency state in ntESCs. To evaluate the pluripotency state in ntESCs, we examined these associated genes in more AN cell lines in this study. There was little difference in the expression of Rian, Gtl2, and Rtl1 between AN F and AN P, which indicated that Rian, Gtl2 and Rtl1 expression from the Dlk1-Dio3 region is not useful for distinguishing fully from partially reprogrammed ntESCs. A previous study indicated that TF stoichiometry influences the Dlk1-Dio3 locus and that some Gtl2-LOW (not OFF) iPSCs can support full-term development [35]. Gtl2 might activate the Dlk1-Dio3 region and be used to assess the quality of reprogramming in iPSCs [36]; that is, the expression of Gtl2 was associated with the quality of iPSCs. Our results showed that Gtl2 expression in AN cell lines was not associated with the pluripotency state in ntESCs and that Gtl2 could not be used to assess the quality of reprogramming in ntESCs.
Epigenetic modifications include DNA methylation and histone modification, which play important roles in both SCNT reprogramming and TF induced pluripotent reprogramming. Cloning of mammals by SCNT results in gestational or neonatal failure, with at most a few percent of manipulated embryos resulting in live births, likely due to inappropriate epigenetic reprogramming [35]. The identity of somatic cells is strictly protected by an epigenetic barrier, and these cells acquire pluripotency when the epigenetic barrier is broken by reprogramming factors such as Oct3/4, Sox2, Klf4, Myc and LIN28 [37,38]. A previous report indicates that a major reprogramming event during early embryonic development is the erasure and subsequent re-establishment of methylation patterns at 5mC [39]. 5hmC is an epigenetic modification that has been suggested to be associated with the pluripotency state during reprogramming of mouse fibroblasts into iPSCs [40]. Our previous results suggested that 5mC-to-5hmC conversion represents a crucial step in the initiation of epigenetic remodeling and transcriptome resetting to achieve pluripotency [41]. A previous report found that the genomic 5mC levels in iPSCs were higher than the levels in ntESCs [27]. Therefore, we compared the levels of 5mC and 5hmC in both ntESCs and iPSCs. However, we found no difference in 5mC and 5hmC levels between the AN F and AN P cell lines, or between the AI F and AI P cell lines. This result was not consistent with the previous study, which reported that ntESCs and iPSCs have different 5mC levels. In this study, we used nine ntESC cell lines and seven iPSC cell lines, which were syngeneic with the same genomic insertion, to examine 5mC and 5hmC levels. These stem cells were well defined as fully or partially reprogrammed ntESCs and iPSCs by the tetraploid complementation assay. We used more cell lines than the previous study, which used only two ntESC lines and two germ-line chimeric iPSC lines without further performing the tetraploid complementation assay.
Our results suggest that genomic 5mC levels might not be useful for evaluating the pluripotency state in AN and AI cell lines and that the dynamic conversion of 5mC to 5hmC might become stable after reprogramming is complete.
To evaluate the pluripotency state of AN cell lines, a comparison of fully and partially reprogrammed cell lines was performed using high throughput sequencing. To avoid differences caused by RNA library construction, quality control, quantification and other sequencing processes, we performed high throughput sequencing on all AN and AI cell lines together, including the previously identified AI cell lines. Interestingly, our results showed a significant difference between AN F and AN P in the expression levels of Grb10 (Figure 4D), which suggested that Grb10 might be a marker for the pluripotency state in ntESCs. Grb10, which is also called Meg1, is an imprinted gene on mouse proximal chromosome 11 and a candidate gene causing Silver-Russell syndrome [42]. Previous reports suggested that aberrant function of Grb10 may contribute to disorders of proliferation, apoptosis, and metabolism, with specific emphasis on growth and neuronal development [43][44][45][46]. A recent study has indicated that Grb10 plays an inhibitory role in hematopoietic stem cell self-renewal and regeneration [47]. We examined Grb10 expression in another series of fully and partially reprogrammed ntESCs derived from fibroblasts and verified that Grb10 is associated with the pluripotency state in ntESCs in our study, although the mechanism for this association remains unclear.
In summary, our study performed an invaluable comparison of syngeneic ntESCs and iPSCs and identified for the first time that an imprinted gene, Grb10, is associated with the pluripotency state in ntESCs.

Mice and cell culture
All of the animal protocols and experiments were approved by the Animal Research Committee of the Institute of Zoology, Chinese Academy of Sciences, and are consistent with the National Institute of Biological Sciences guide for the care and use of laboratory animals.
FN cell lines were derived from SCNT blastocysts using 1°-TTFs as the donor cells.
A schematic of the cell lines derivation is shown in Supplementary Figure 1.

Karyotype analysis
The cells were incubated in ESC medium with 0.25 μg/ml colcemid (Invitrogen, Thermo Fisher Scientific) for 2-3 h and harvested with 0.05% Trypsin-EDTA (Invitrogen, Thermo Fisher Scientific). After incubation in hypotonic solution containing 0.4% sodium citrate and 0.4% potassium chloride (1:1, v/v) at 37°C for 5 min, the cells were fixed with a methanol/acetic acid mixture (3:1, v/v). The fixed cells were mounted on glass slides, dried, and stained with Giemsa at 37°C for 10-15 min. At least 20 metaphase chromosome spreads were examined for each cell line.

Teratoma formation
AN and AI cells (2-5×10⁶) were subcutaneously injected into the groin of severe combined immune deficiency (SCID) mice. Tumors were dissected and processed for hematoxylin-eosin staining 6-8 weeks after injection.

Embryoid body formation
AN and AI cells were trypsinized into a single cell suspension and transferred to Petri dishes in DMEM supplemented with 15% FBS without LIF. Three to seven days later, the embryoid bodies (EBs) were harvested and plated onto gelatin-coated tissue culture dishes for another 3-7 days. Total RNA from plated EBs was extracted and used for quantitative PCR. GAPDH was used as an endogenous control.

Quantitative reverse-transcription PCR
Total RNA was purified using TRIzol (Invitrogen, Thermo Fisher Scientific). RNA (2 μg) was reverse-transcribed using M-MLV Reverse Transcriptase and RNasin RNase Inhibitor (Promega, Madison, WI, USA). Quantitative reverse-transcription PCR was performed using SYBR Premix Ex Taq (Takara, Kusatsu, Japan). The reactions were performed in triplicate on a 1/10 dilution of the cDNA obtained above. Gene expression in each sample was normalized to GAPDH, and the relative quantification of expression was estimated using the comparative CT method. All of the primers used are listed in Supplementary Table 2.
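The comparative CT calculation can be illustrated with a minimal sketch. It assumes the standard 2^-ΔΔCT form with roughly 100% amplification efficiency for both the target gene and GAPDH; the function name and the CT values below are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target, ct_gapdh, ct_target_cal, ct_gapdh_cal):
    """Fold change of a target gene by the comparative CT (2^-ddCT) method.

    Each argument is a list of triplicate CT values. The first pair comes
    from the sample of interest, the second pair from the calibrator sample.
    Assumes ~100% PCR efficiency for both amplicons (an assumption here).
    """
    # Normalize the target gene to the endogenous control (GAPDH).
    delta_ct_sample = mean(ct_target) - mean(ct_gapdh)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_gapdh_cal)
    # Compare the sample with the calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicate CT values, for illustration only.
fold = relative_expression(
    ct_target=[24.0, 24.5, 23.5],      # target gene, sample of interest
    ct_gapdh=[18.0, 18.0, 18.0],       # GAPDH, sample of interest
    ct_target_cal=[26.0, 26.0, 26.0],  # target gene, calibrator
    ct_gapdh_cal=[18.0, 18.0, 18.0],   # GAPDH, calibrator
)
print(fold)  # 4.0
```

A ΔΔCT of -2 cycles corresponds to a four-fold higher normalized expression in the sample than in the calibrator.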

Tetraploid complementation
To perform tetraploid complementation, B6D2F1 embryos at the 2-cell stage were electrofused to tetraploid embryos, 10-15 cells were injected into the reconstructed tetraploid blastocysts, and these were transplanted into the uteri of pseudo-pregnant mice. Caesarean sections were performed on day 19.5, and pups were fostered by lactating ICR mothers.

RNA-seq data analysis
Total RNA was extracted from the different cell lines using TRIzol (Invitrogen, Thermo Fisher Scientific). Libraries were constructed with the NEBNext DNA Library Prep Master Mix Set for Illumina (New England Biolabs, MA, USA), and PCR products were purified using AMPure XP beads (Beckman, Brea, CA, USA). The RNA library was quantified using Qubit 1.0 (Invitrogen, Thermo Fisher Scientific), analyzed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) for size distribution, and then sequenced with an Illumina HiSeq 2500 in single-end mode (1x50nt) by the Bioinformatics core facility at the National Institute of Biological Sciences, Beijing.
The 51 bp sequences called by the Illumina pipeline were mapped to the mouse genome (mm9) using TopHat (v2.1.0) for data analysis. Gene annotation and calculation of FPKM values were performed using Cufflinks (v2.2.1) with the GTF annotation file (mm9).
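For readers unfamiliar with the FPKM unit (fragments per kilobase of transcript per million mapped fragments), the following sketch shows the simple count-based definition. It is not the Cufflinks implementation, which estimates abundances with a statistical model; the function name and the numbers are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Count-based FPKM for a single transcript.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
value = fpkm(500, 2000, 10_000_000)
print(value)  # 25.0
```

Because the value is scaled by both transcript length and library size, FPKM values can be compared between genes of different lengths and between libraries of different depths.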
Gene expression differences were assessed by Cuffdiff with a false discovery rate correction for multiple testing. Genes with a p-value < 0.05 and q-value < 0.05 were considered differentially expressed.
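The two-threshold selection described above can be sketched as follows. The sketch assumes that the q-value is a Benjamini-Hochberg adjusted p-value; the gene names and p-values are hypothetical, and in practice the q-values would be read directly from the Cuffdiff output rather than recomputed.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical per-gene p-values, for illustration only.
p_by_gene = {"GeneA": 0.001, "GeneB": 0.02, "GeneC": 0.04, "GeneD": 0.5}
genes = list(p_by_gene)
q_values = benjamini_hochberg([p_by_gene[g] for g in genes])

# Keep genes that pass both thresholds: p-value < 0.05 and q-value < 0.05.
significant = [
    gene for gene, q in zip(genes, q_values)
    if p_by_gene[gene] < 0.05 and q < 0.05
]
print(significant)  # ['GeneA', 'GeneB']
```

In this example GeneC has a nominal p-value below 0.05 but an adjusted q-value of about 0.053, so it is excluded once multiple testing is taken into account.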

Liquid chromatography-mass spectrometry (LC-MS/MS) Analysis of 5mC and 5hmC
The genomic DNA (5 μg) from different cell lines was analyzed by liquid chromatography-tandem mass spectrometry to assess the quantity of 5mC and 5hmC, as described in our previous study [41,49,50].

Statistics
Student's t tests were performed using SigmaStat 3.5 software for statistical comparisons, and one-way ANOVA was performed using SPSS Statistics 19 software.