Improving molecular diagnosis of Chinese patients with Charcot-Marie-Tooth by targeted next-generation sequencing and functional analysis

Charcot-Marie-Tooth (CMT) disease is the most common hereditary peripheral neuropathy. More than 50 causative genes have been identified. The lack of genotype-phenotype correlations in many CMT patients makes it difficult to decide which genes are affected. Recently, targeted next-generation sequencing (NGS) has been introduced as an alternative approach for the diagnosis of genetic disorders. Here, we applied targeted NGS in combination with PMP22 duplication/deletion analysis to screen causative genes in 22 Chinese CMT families. The novel variants detected by targeted NGS were then further studied in cultured cells. Of the 22 unrelated patients, 8 had PMP22 duplication. Targeted NGS revealed 10 possibly pathogenic variants in 11 patients, including 7 previously reported variants and 3 novel heterozygous variants (GJB1: p.Y157H; MFN2: p.G127S; YARS: p.V293M). Further classification of the novel variants according to American College of Medical Genetics and Genomics (ACMG) standards and guidelines, together with functional analysis in cultured cells, indicated that p.Y157H in GJB1 was pathogenic, p.G127S in MFN2 was likely pathogenic, and p.V293M in YARS was likely benign. Our results suggest that targeted NGS has the potential to provide a more rapid and precise diagnosis in CMT patients. Moreover, functional analysis is required when the pathogenicity of a novel variant is unclear.


INTRODUCTION
Charcot-Marie-Tooth (CMT) is the most common inherited neuromuscular disorder, with a prevalence of 40 individuals in every 100 000 inhabitants [1]. The classical symptoms include slowly progressive distal muscle weakness, muscle atrophy and sensory loss, affecting the lower and then the upper limbs. As both the motor and sensory peripheral nerves are affected, it is also called hereditary motor and sensory neuropathy. On the basis of electrophysiological results, CMT has traditionally been subdivided into two main groups: the demyelinating type (CMT1) and the axonal type (CMT2).
Thus far, more than 50 disease-causing genes have been associated with CMT (http://www.molgen.ua.ac.be/CMTMutations/; http://neuromuscular.wustl.edu/), of which the duplication/deletion of PMP22 is the most common cause of CMT1 [2]. The traditional strategy for molecular diagnosis of CMT is based on the clinical phenotype, inheritance pattern and electrophysiological examination. However, CMT is a highly heterogeneous disorder [3]. It may be inherited in more than one mode of inheritance, and mutations in a single gene can result in multiple clinical phenotypes. Moreover, there are still many unknown causative genes of CMT waiting to be discovered [4]. It is impractical for a laboratory to investigate all the possible genes using conventional Sanger sequencing, which is time-consuming and expensive. Consequently, a more comprehensive approach to the molecular diagnosis of CMT is needed. Targeted next-generation sequencing (NGS), a high-throughput DNA sequencing technology that sequences the genomic regions of interest in parallel [5], makes this possible.
Recently, targeted NGS has been successfully applied to the diagnosis of inherited neurologic diseases [6][7][8][9]. It is an efficient and cost-effective tool for achieving a genetic diagnosis of inherited peripheral neuropathies [10][11][12]. However, to date no study has evaluated the efficiency of targeted NGS in Chinese CMT patients. Here, we applied targeted NGS in combination with PMP22 duplication/deletion analysis in a cohort of 22 Chinese CMT families. The novel sequence variants identified by targeted NGS were classified according to the American College of Medical Genetics and Genomics (ACMG) standards and guidelines [13], and further assessed by functional analysis in cultured cells.

PMP22 duplication/deletion analysis
Multiplex ligation-dependent probe amplification (MLPA) analysis was performed in all patients before targeted NGS. It showed that 8 of the 22 unrelated CMT patients had PMP22 duplication (Supplemental Table 1). The remaining CMT patients were further screened for mutations by targeted NGS.

Identification of variants by targeted NGS and Sanger sequencing
Targeted NGS was performed in the remaining 14 patients. Our gene panel included 44 genes (Table 1). The fraction of target bases covered at each depth is given in Table 2. Over 99% of target bases had >10x coverage, 97.43% had >30x coverage and 94.24% had >50x coverage. The mean coverage of target bases ranged from 106.86x to 1287.75x.
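The coverage fractions quoted above are threshold counts over per-base sequencing depth. The following is a minimal sketch of that calculation, assuming the per-base depths of the target region are already available as a list of integers; the function name and the toy depths are illustrative and are not taken from the study.

```python
from typing import Dict, Iterable, Sequence


def coverage_summary(depths: Sequence[int],
                     thresholds: Iterable[int] = (10, 30, 50)) -> Dict[str, float]:
    """Return the mean depth and the percentage of target bases above each threshold."""
    if not depths:
        raise ValueError("depths must contain at least one target base")
    n = len(depths)
    summary = {"mean": sum(depths) / n}
    for t in thresholds:
        # Strictly greater than the threshold, matching the ">10x" convention in the text.
        summary[f">{t}x"] = 100.0 * sum(1 for d in depths if d > t) / n
    return summary


# Hypothetical depths for ten target bases (not real data).
toy_depths = [5, 12, 35, 60, 120, 300, 45, 51, 10, 31]
print(coverage_summary(toy_depths))
```

In practice the depth list would come from a per-base depth report for the captured regions, and the same function could be applied per sample to obtain the range of mean coverage.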

Classification of novel variants and functional analysis
The three novel variants were absent from the 1000 Genomes and dbSNP databases and were not present in 500 control subjects. The SIFT and PolyPhen-2 software programs were used to predict whether each amino acid change would disrupt protein function.
The novel variant c.379G>A (p.G127S) in MFN2 affects the same amino acid as two previously reported mutations (p.G127D and p.G127V) [14,15]. This variant was further identified in the proband's affected mother and younger brother, who presented with a similar phenotype, while her unaffected father did not carry the variant (Figure 1A and 1B). The amino acid change was predicted to be deleterious by SIFT (score: 0) and PolyPhen-2 (score: 1.0). Additionally, the variant site was conserved in different animal species (Figure 1C). According to ACMG standards and guidelines, this variant was classified as a likely pathogenic variant.
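The coding-DNA and protein coordinates of a substitution such as c.379G>A (p.G127S) are linked by simple codon arithmetic, which gives a quick consistency check on the notation. The sketch below assumes coding positions numbered from the A of the start codon; the helper names are illustrative, and only the slice of the standard genetic code needed for this example is included. It does not assert which reference codon MFN2 actually uses, only which glycine codons are compatible with the reported change.

```python
from typing import List, Tuple


def codon_index(c_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, position within the codon)."""
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


# Subset of the standard genetic code: glycine codons and their G>A first-base products.
CODON_TABLE = {
    "GGA": "G", "GGC": "G", "GGG": "G", "GGT": "G",
    "AGA": "R", "AGC": "S", "AGG": "R", "AGT": "S",
}


def compatible_ref_codons(ref_aa: str, alt_aa: str, pos_in_codon: int,
                          ref_base: str, alt_base: str) -> List[str]:
    """List reference codons for which the substitution yields the reported amino acid."""
    hits = []
    for codon, aa in sorted(CODON_TABLE.items()):
        if aa != ref_aa or codon[pos_in_codon - 1] != ref_base:
            continue
        mutated = codon[:pos_in_codon - 1] + alt_base + codon[pos_in_codon:]
        if CODON_TABLE.get(mutated) == alt_aa:
            hits.append(codon)
    return hits


codon, pos = codon_index(379)
print(codon, pos)  # 127 1: c.379 is the first base of codon 127
print(compatible_ref_codons("G", "S", pos, "G", "A"))  # ['GGC', 'GGT']
```

The same arithmetic places c.877 and c.469 at the first base of codons 293 and 157, in agreement with the protein-level names of the other two novel variants.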
The novel variant c.877G>A (p.V293M) in YARS (Figure 1D and 1E) was predicted to be deleterious by SIFT (score: 0.02) but benign by PolyPhen-2 (score: 0.049). The variant site was not conserved in different animal species (Figure 1F). Segregation analysis was not possible, as blood samples were unobtainable from other family members. In contrast to all previously identified YARS mutations [16], this novel variant was located in the anticodon recognition region of the YARS protein.
The novel variant c.469T>C (p.Y157H) in GJB1 (Figure 1G and 1H) affected the same amino acid as the previously reported mutation p.Y157C [17]. The variant site was conserved in different animal species (Figure 1I), and the change was predicted to be deleterious by SIFT (score: 0) and PolyPhen-2 (score: 1.0). However, segregation analysis was not performed, as no other family members were available for further examination.
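The SIFT and PolyPhen-2 scores reported for the three novel variants can be compared side by side to show where the two predictors agree. The sketch below uses the scores given in the text; the cut-offs (SIFT score at or below 0.05, PolyPhen-2 score at or above 0.85) are commonly used conventions assumed here for illustration and are not stated in the study.

```python
from dataclasses import dataclass

# Assumed cut-offs (not stated in the text); adjust to the tool versions in use.
SIFT_DELETERIOUS_MAX = 0.05
POLYPHEN_DAMAGING_MIN = 0.85


@dataclass(frozen=True)
class Prediction:
    gene: str
    change: str
    sift: float
    polyphen2: float


def summarize(p: Prediction) -> str:
    """Report whether the two predictors agree on a damaging call."""
    sift_damaging = p.sift <= SIFT_DELETERIOUS_MAX
    pp2_damaging = p.polyphen2 >= POLYPHEN_DAMAGING_MIN
    if sift_damaging and pp2_damaging:
        return "both damaging"
    if not sift_damaging and not pp2_damaging:
        return "neither damaging"
    return "discordant"


# Scores as reported for the three novel variants.
PREDICTIONS = [
    Prediction("MFN2", "p.G127S", 0.0, 1.0),
    Prediction("YARS", "p.V293M", 0.02, 0.049),
    Prediction("GJB1", "p.Y157H", 0.0, 1.0),
]

for p in PREDICTIONS:
    print(p.gene, p.change, summarize(p))
```

Under these assumed cut-offs, only the YARS variant gives discordant predictions, which is the kind of case where segregation data or functional analysis carries more weight than in silico scores.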
To determine the consequences of the amino acid changes in YARS and GJB1, functional analysis was performed in cultured cells. HEK293 cells transfected with wild-type or mutant YARS had comparable mRNA (Supplementary Figure 1) and protein levels (Figure 2A). Furthermore, the p.V293M variant did not change the protein's distribution in HeLa cells (Figure 2B), HEK293 cells or SH-SY5Y neuroblastoma cells (data not shown). These results suggested that the novel variant p.V293M in YARS was likely benign. With regard to GJB1, we found that the p.Y157H change did not affect the GJB1 mRNA (Supplementary Figure 1) or protein level (Figure 2C). The fluorescence study revealed that HeLa cells transfected with EGFP-GJB1-Wt formed intracellular granules, whereas cells expressing EGFP-GJB1-Y157H had diffuse intracellular staining (Figure 2D). These data indicated that the p.Y157H variant affected the intracellular distribution of GJB1, supporting the pathogenicity of this novel variant.

Clinical features of CMT patients
The clinical features of the recruited CMT patients are summarized in Supplemental Table 1. Among these patients, 17 were male and 5 were female, and ages ranged from 12 to 66 years. All cases displayed a progressive phenotype. In the majority of CMT patients, the symptom at disease onset was muscle weakness in the lower limbs. Motor deficits were more marked distally and more severe in the lower limbs.
The patient (case 2) carrying the likely pathogenic variant p.G127S in MFN2 was a 17-year-old female (Table 3). Her symptoms and signs began at the age of 6 with gait disturbance. At the age of 15, she underwent orthopedic surgery. Clinical examination revealed reduced deep tendon reflexes, muscle weakness in the distal limbs and muscle atrophy in both hands. Conduction velocity of the median nerve was reduced, with a moderate decrease in distal amplitudes.
The patient (case 7) carrying the likely benign variant p.V293M in YARS was a 43-year-old male (Table 3). He had had foot drop and distal muscle weakness and atrophy in the lower limbs for about 18 years. He did not report numbness in the limbs. Neurological examination revealed bilateral foot drop with pes cavus, muscle weakness and atrophy in the lower limbs, and areflexia in all limbs. He had horizontal nystagmus. The motor nerve conduction velocity and compound muscle action potential were markedly reduced in all limbs.
The patient (case 9) carrying the pathogenic variant p.Y157H in GJB1 was a 23-year-old female from a large family with several affected members (Table 3). Onset of the disease was at the age of 21, with distal weakness in the legs and steppage gait. Neurological examination revealed pes cavus, reduced muscle strength in the distal lower limbs and areflexia in all limbs. Electrophysiological studies showed reduced nerve conduction velocity and compound muscle action potential in the peroneal nerve.

DISCUSSION
There are a number of different NGS technologies, including whole genome sequencing (WGS), whole exome sequencing (WES) and targeted NGS [18][19][20]. While WGS and WES may seem more attractive than targeted NGS for identifying underlying genetic causes, one of their drawbacks is that they generate a huge amount of unnecessary data. It is difficult and time-consuming to interpret those variants, especially when only one patient is available for sequencing. For targeted NGS, a panel covering a defined set of candidate genes can be designed to screen for causative mutations. This offers advantages in cost and in the speed of data interpretation [5,21]. Therefore, in the current study, we used targeted NGS to detect the causative genes in Chinese CMT patients. Since targeted NGS is relatively less effective at detecting copy number variants [12,22], the PMP22 duplication/deletion was analyzed first.
Thus far, the targeted NGS approach has been applied to CMT patients in several studies [11,12,18]. Like these studies, our results demonstrated that targeted NGS is an effective method for making a molecular diagnosis in CMT patients. However, our study differs from those studies in several respects. First, apart from the PMP22 duplication/deletion analysis, other common causative genes, such as GJB1 and MFN2, were not screened before targeted NGS. Published studies have revealed that over 90% of CMT patients with a molecular diagnosis carry mutations in PMP22, MPZ, GJB1 or MFN2 [4,23]. In our study, 8 cases had mutations in GJB1 and MFN2. Secondly, a specific dominant gene panel was designed in our study. Therefore, we recruited CMT families with a dominant inheritance pattern. For targeted NGS, the most significant difficulty is how to analyze and interpret the possibly pathogenic variants. To overcome this obstacle, a more logical approach is to design a smaller panel that covers the major subtypes.
A disease subtype-specific NGS panel costs less than a complete CMT gene panel, and the number of irrelevant variants is reduced [5]. In our study, fewer variants were identified with the dominant CMT gene panel than have been reported with a complete CMT gene panel [11]. Lastly, we made a molecular diagnosis in 10 patients using targeted NGS (case 7, who carried only a likely benign variant, was not counted). This diagnostic success rate is higher than previously reported in the literature [10][11][12].
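The diagnostic yield of each step follows directly from the counts given in the text (22 families, 8 with PMP22 duplication, 14 screened by targeted NGS, 10 of those diagnosed). The short calculation below is only a worked restatement of those numbers; the variable and function names are illustrative.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


total_families = 22
pmp22_duplication = 8
ngs_screened = total_families - pmp22_duplication  # 14 patients went on to targeted NGS
ngs_diagnosed = 10                                 # case 7 (likely benign variant) not counted

print("MLPA yield:", pct(pmp22_duplication, total_families))                      # 36.4
print("Targeted NGS yield:", pct(ngs_diagnosed, ngs_screened))                    # 71.4
print("Combined yield:", pct(pmp22_duplication + ngs_diagnosed, total_families))  # 81.8
```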
After interpretation, 10 possibly pathogenic variants were found in our study. The clinical characteristics of the CMT patients with known variants are comparable to the reported data [15][24][25][26][27][28]. In case 2, a novel variant in MFN2 was identified. MFN2 is localized in the outer mitochondrial membrane and contains two hydrophobic heptad repeat domains and a GTPase domain. The novel variant was located in the GTPase domain. An intact GTPase domain is indispensable for mitochondrial fusion [29]. In case 9, a novel pathogenic variant in GJB1 was identified. Functional analysis revealed that wild-type GJB1 showed punctate staining in mammalian cells, whereas this pattern disappeared when the cells expressed the GJB1 mutant (p.Y157H). This result is consistent with previously reported GJB1 mutants (p.M34K, p.N205I and p.Y211X). Many GJB1 mutants show abnormal trafficking when expressed in mammalian cells [30].
In case 7, a novel variant in YARS was detected. YARS encodes tyrosyl-tRNA synthetase (TyrRS), an aminoacyl-tRNA synthetase involved in dominant-intermediate CMT [16,31]. TyrRS contains three functional domains: an N-terminal catalytic domain, a central anticodon recognition domain and a C-terminal EMAP II-like domain. So far, all previously identified pathogenic TyrRS mutations, including p.G41R, p.E196K and p.153-156delVKQV, have been located in the catalytic domain of the protein. These mutations affect the protein's localization in sprouting neurites of neuroblastoma cells [16]. A missense variant, p.K265N, located in the anticodon recognition domain of TyrRS has been reported. However, it was demonstrated to be a benign polymorphism [32]. The novel variant we found was also located in the anticodon recognition domain of TyrRS. Further functional analysis revealed that this variant did not change the protein's expression or intracellular localization. Therefore, this novel variant in YARS was classified as likely benign. Even after targeted NGS, 4 cases (including case 7) still did not have a molecular diagnosis. There may be several reasons for this. First, because the filtering criteria were strict, it is possible that some truly pathogenic variants were classified as benign. Secondly, apart from the copy number variations (CNVs) of PMP22, we did not test for CNVs in other causative genes. Although CNVs outside the PMP22 locus are rare, it is still important to test for them in CMT cases negative for the PMP22 duplication [33][34][35]. Finally, there are still many other CMT causative genes that remain to be identified. It has been estimated that 50% of CMT patients may carry mutations in unknown disease genes [4].
In summary, we have combined targeted NGS and PMP22 duplication/deletion analysis as a diagnostic strategy for CMT patients. Besides 7 previously reported variants, we found 3 novel variants after targeted NGS analysis. Further classification of sequence variants according to ACMG standards and guidelines and functional analysis in cultured cells indicated that p.Y157H in GJB1 was pathogenic, p.G127S in MFN2 was likely pathogenic, while p.V293M in YARS was likely benign. Although the number of cases was small, our results still revealed an increase in diagnostic success rate.

Patients
Patients were recruited from the Department of Neurology, Huashan Hospital, and the Second Affiliated Hospital of Zhejiang University School of Medicine from December 2007 to June 2015. Patients were considered to have CMT if they had a sensorimotor peripheral neuropathy (according to their medical history, neurological examination and neurophysiological testing) and a family history of similar characteristics. The clinical diagnostic strategy followed that described previously [23]. In total, 22 Chinese CMT families with a dominant inheritance pattern were enrolled. Clinical evaluations were carried out by at least two senior neurologists. Routine blood biochemical tests and electrophysiological tests were performed. Five hundred unrelated aged individuals (≥65 years) without a history of CMT were selected as a control group. Written informed consent was obtained from all participants. This study was approved by the ethics boards of Huashan Hospital and the Second Affiliated Hospital.

PMP22 duplication/deletion analysis
All the CMT patients were tested for the PMP22 duplication/deletion using MLPA as reported previously [36]. MLPA was performed using the MLPA kit (MRC Holland, Netherlands), according to the manufacturer's protocol.

Targeted NGS
Genomic DNA was extracted from EDTA-treated peripheral blood using the Blood Genomic Extraction Kit (Qiagen, Germany). A dominant gene panel was designed to cover 44 genes known to be associated with dominant CMT and other inherited peripheral neuropathies (Table 1). Genetic features can be found in online databases including the Neuromuscular Disease Center (http://neuromuscular.wustl.edu/time/hmsn.html) and the Inherited Peripheral Neuropathies Mutation Database (http://www.molgen.ua.ac.be/cmtmutations/). All exons of the targeted genes and 20 flanking base pairs at each splice junction were included. The samples were captured with NimbleGen SeqCap EZ products (Roche, Switzerland). Deep sequencing was then performed on an Illumina HiSeq2000 platform (Genergy Biotechnology Co Ltd, Shanghai, China). Reads were aligned to the hg19 reference genome (http://hgdownload.cse.ucsc.edu/) using the Burrows-Wheeler Aligner (BWA, version 0.7.12-r1039) [37]. Variants were detected using the Genome Analysis Toolkit (GATK, version 3.1-1-g07a4bf8) following the GATK best practices [38]. All variants were annotated with ANNOVAR (version 2014-11-12). Variants were further filtered as described in our previous publication [39]. Two software programs, SIFT (http://sift.jcvi.org/) and PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), were used to predict the possible change in protein function caused by each variant.
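The target definition described above (every exon plus 20 flanking base pairs) amounts to padding each exon interval and merging any overlaps. The following is a minimal sketch of that step, assuming 0-based half-open intervals on a single chromosome, as in BED files; the function name and the exon coordinates are made up for illustration and do not describe the actual capture design.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based half-open


def pad_and_merge(exons: List[Interval], flank: int = 20) -> List[Interval]:
    """Extend each exon by `flank` bases on both sides and merge overlapping intervals."""
    padded = sorted((max(0, start - flank), end + flank) for start, end in exons)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exon coordinates (not real gene annotations).
exons = [(1000, 1100), (1130, 1200), (5000, 5050)]
targets = pad_and_merge(exons)
print(targets)                           # [(980, 1220), (4980, 5070)]
print(sum(e - s for s, e in targets))    # 330 target bases
```

Merging matters because closely spaced exons would otherwise be counted twice when computing the size of the target region or the fraction of target bases covered.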

Sanger sequencing
Sanger sequencing was used to validate all the potential variants using standard protocols. PCR was performed to amplify the fragments covering the variant sites. The PCR products were purified and directly sequenced on an ABI 3730 DNA Sequencer. The sequencing results were aligned to the human reference genome published in Ensembl (http://www.ensembl.org/).

Functional studies in cultured cells
The cDNA encoding human GJB1 was cloned into HindIII/KpnI site of pEGFP-C2 vector. The cDNA encoding human YARS was cloned into HindIII/KpnI site of pFLAG-CMV-4 vector and pEGFP-C2 vector. All mutant constructs of GJB1 and YARS were created by PCR mutagenesis and verified by Sanger sequencing.
HEK293 and HeLa cells were cultured in DMEM supplemented with 10% fetal bovine serum in a 5% CO2 incubator. To explore whether the novel variants affected mRNA and protein expression, HEK293 cells were transfected with vectors expressing the wild-type or mutant protein, or with the empty vector. Transient transfection was performed using Lipofectamine 2000 (Invitrogen, USA) according to the manufacturer's protocol. Forty-eight hours after transfection, cells were harvested and lysed. The mRNA expression levels of YARS and GJB1 were measured by RT-PCR. The following primers were used: GJB1 Forward: GCGTGAACCGGCATTCTACT; GJB1 Reverse: TTGGTCATAGCAAACGCTGTT; YARS Forward: CTGCACCTTATCACCCGGAAC; YARS Reverse: TCCGCAAACAGAATTGTTACCT; GAPDH Forward: ACTCCACGACGTACTCAG; GAPDH Reverse: CATGTTCCAATATGATTCCACC. The protein samples were resolved by SDS-PAGE, transferred to nitrocellulose membranes and blotted with the relevant antibodies. Antibodies against Flag (1:5 000; Abmart, China), GFP (1:5 000; Santa Cruz, USA) and β-actin (1:5 000; Sigma-Aldrich, USA) were used.
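Basic properties of the RT-PCR primers listed above, such as length and GC content, can be tabulated with a few lines of code. This is only an illustrative check on the published sequences, not a step described in the study; the function names are hypothetical.

```python
# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "GJB1 Forward": "GCGTGAACCGGCATTCTACT",
    "GJB1 Reverse": "TTGGTCATAGCAAACGCTGTT",
    "YARS Forward": "CTGCACCTTATCACCCGGAAC",
    "YARS Reverse": "TCCGCAAACAGAATTGTTACCT",
    "GAPDH Forward": "ACTCCACGACGTACTCAG",
    "GAPDH Reverse": "CATGTTCCAATATGATTCCACC",
}


def validate(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G or T."""
    seq = seq.upper()
    bad = set(seq) - set("ACGT")
    if not seq or bad:
        raise ValueError(f"invalid primer sequence: {seq!r}")
    return seq


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = validate(seq)
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: length {len(seq)}, GC {gc_content(seq):.1f}%")
```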
To further define the intracellular distribution of wild-type and mutant protein in mammalian cell lines, HeLa cells transiently over-expressing EGFP-YARS-Wt, EGFP-YARS-V293M, EGFP-GJB1-Wt, EGFP-GJB1-Y157H or the empty vector were directly observed under a confocal microscope (Leica, Germany).