Exome sequencing analysis of murine medulloblastoma models identifies WDR11 as a potential tumor suppressor in Group 3 tumors

Mouse models of human cancers are widely used in cancer research, yet questions frequently arise regarding their faithfulness in recapitulating their human counterparts. To compare the somatic mutations of murine models with human medulloblastoma (MB), we performed whole-exome sequencing on 12 tumors representing three distinct medulloblastoma subgroups: Wnt, Sonic Hedgehog (Shh) and Group 3 (G3). In total, 64 somatic mutations were identified and validated, including 40 predicted to cause amino acid changes. After filtering and cross-species analysis with 366 human MBs from four independent studies, human orthologs for 16 of the 40 mouse genes were found to harbor non-silent mutations in human MB. Loss-of-function Kmt2d mutations detected in one mouse tumor was previously reported in 30 of 366 human MBs. In mice bearing G3 MB, one mouse succumbed to tumor burden at least 15 days earlier than other mice, raising the possibility that somatic mutations may have accelerated the tumorigenesis process. In this mouse tumor, four novel candidate genes harbored non-silent somatic mutations, Lrfn2, Smyd1, Ubn2 and Wdr11. Extended survival was found in mice harboring mouse G3 overexpressing WDR11 but not the other three genes. Genes in the KEGG WNT signaling pathway, including Ccnd1/2/3, Myc and Tcf7l1, were down-regulated in the transcriptome of G3 MB tumorspheres overexpressing WDR11, consistent with reduced tumor progression. In conclusion, we demonstrated that common spontaneous mutations were shared between human and murine models of MB suggesting similar molecular mechanisms of tumorigenesis, and identified WDR11 as a protein with tumor suppressive activity in G3 MB.


INTRODUCTION
Genetically engineered mouse models (GEMM) provide a powerful in vivo system to test the role of genes mutated or overexpressed in human tumors [1]. Typically by overexpressing an oncogene or inactivating a tumor suppressor gene, normal murine cells can be transformed with histopathology similar to human cancers [2]. Despite Priority Research Paper the myriad of successful examples, mouse models have been frequently scrutinized with regards to their validity [3]. One key question is whether murine cancers and their human counterparts share similar tumorigenic processes which can be evaluated by comparing somatic mutational landscapes. Indeed, certain GEMMs have been found to harbor mutations in genes also mutated in human cancers [4,5]. Next-generation sequencing (NGS) can comprehensively characterize the mutation landscape of cancers. However, NGS-based somatic mutation analysis in mice can be challenging due to complex mouse genetic background, especially when the matched normal DNA is not available [4].
Medulloblastoma (MB) is the most common malignant pediatric brain tumor that arises within the posterior fossa [6,7]. Expression profiling of human MBs subdivides this cancer into four major molecularly distinct subgroups: Wingless (WNT), Sonic Hedgehog (SHH), Group 3 (G3) and Group 4 G4 [8][9][10][11][12]. Genetically engineered MB mouse models have been developed for Wnt [13], Shh [14][15][16][17][18] and G3 [19][20][21][22]. These subgroupspecific mouse models recapitulate the histopathology and gene expression profiles of their human counterparts [23] and can be utilized for screening and preclinical testing of therapeutics [24], which have been shown to increase the cure-rate of MB and quality of life of surviving patients [25][26][27]. However, very little is known about the role of additional somatic mutations involved in the process of tumorigenesis. To evaluate this in MB, we performed whole-exome sequencing (WES) to identify somatic mutations in 12 mouse tumors from 3 MB subgroup-specific murine models established at St. Jude Children's Research Hospital: five Shh (MBS) [17], three Wnt (MBW) [13] and four G3 (MBM) [19]. A total of 64 somatic mutations were identified and experimentally validated, including 40 non-silent mutations predicted to cause amino acid changes. Survival data were available for the MB subgroup G3 mouse model -one mouse had a much shortened life than the other mice in the cohort. This mouse tumor had four genes with non-silent mutations, Lrfn2, Smyd1, Ubn2 and Wdr11. We hypothesized that the mutations to these four genes may have facilitated tumor growth, and evaluated the function of each of these genes on G3 MB progression using in vivo systems.

Whole exome sequencing of tumor samples identified putative somatic mutations
The murine MB models of three different subgroups were established as described previously [13,14,19,28]. After whole-exome sequencing (WES), the reads were mapped to mouse genome (mm9) with at least 90% mapping rates across all samples (Supplementary Table  1). Most (11 of 12) samples had at least 80% coding bases covered by a minimum of 20x coverage. We did not retrospectively bank germline DNA needed for somatic mutation calling. To overcome this challenge, we went through extensive germline polymorphisms filtering process that removed both common mouse polymorphisms and rare variants found in other raw sequencing data of common laboratory mouse strains. The entire process of single nucleotide variations (SNVs) identification, filtering and validation is illustrated in Figure 1a. Specifically, the initial SNV calling was performed using Bambino [29]  by comparing each tumor with publicly available normal sequences of C57BL/6 mouse strain [30]. With filtering and manual curation, 213 putative mutations remained. Amplicon sequencing by Sanger or MiSeq using tumor specimen and all available mice of common ancestors as control validated 62 somatic SNVs and two indels (Supplementary Table 2).
The overall mutation burden, 64 mutations in 12 mouse tumors or 5.3 mutations per tumor, is lower than previously reported for human MBs [8]. The number of somatic mutations per tumor ranged from 0 to 31, where the Wnt subgroup had higher burden than the Shh (p-value = 0.03) and G3 (p-value = 0.10) subgroups (Figure 1b). The final missense-to-silent mutation ratio was 1.7, and 40 mutations were predicted to cause changes in amino acid sequence. Due to the limited numbers of mice (3-5 per subgroup), no gene was recurrently mutated.

Common non-synonymous mutations shared by MB mouse models and human MBs
To identify common mutations shared by mouse and human MBs, we compared the mutations identified from 12 mouse MBs to four independent human MB studies, consisting of a total of 366 MBs [8,10,11,31]. Of the 40 mouse genes that carry amino-acid-changing mutations, 16 genes' human orthologs were found to be mutated in human MBs. In seven genes, the exact type of mutation was found in the same subgroup of MB, including BAI3, DOCK7, LRFN2, MLL2, MLL3, PIWIL4 and WRD11 (Table 1). The most common mutations occurred in well-known MB genes KMT2D and KMT2C which contained 30 and 10 mutations in human MBs, respectively (Table 1). Remarkably, Kmt2d harbored a nonsense SNV predicted to cause loss of the SET domain which confers the methyltransferase activity [32,33]. For comparison, the 366 human MBs harbored 30 KMT2D mutations, including 21 loss-of-function mutations (nonsense SNVs or frameshift indels) and nine missense mutations confirming the validity of the mouse models of MB. Specifically, 4 out of 48 KMT2D mutations occurred in WNT human MB subgroup, whereas one out of three WNT mice has a Kmt2d mutation (p = 0.015, hypergeometric test).

Potential secondary driver mutation suggested by survival data
We addressed the overall survival of mice with G3 MBs (Supplementary Table 3). Among four mice in this group, MBM003, which had the highest mutation burden (Figure 1b), succumbed to tumor-burden at least 15 days  earlier than the other mice. This led to the hypothesis that this mouse may harbor deleterious mutations which accelerated tumor progression. There were four non-silent mutations in this tumor, all missense SNVs: Lrfn2 R656H, Smyd1 R237Q, Ubn2 Q549K and Wdr11 I46T.
To test the potential tumor suppressor function of the four candidate genes, we used lentiviral transduction to enforce the expression of each of these wild type genes or an empty vector as control, in the mouse G3 tumorsphere line # 19568 prior to implantation of 100,000 unsorted tumor cells into the cortices of CD1 nu/nu mice ( Figure 2). Enforced expression of LRFN2 did not alter the survival of tumor-bearing mice ( tumor-bearing mice. Analysis of these tumors showed increased WDR11 transcript levels ( Figure 2b, lower right). Protein expression was observed in tumorspheres from secondary tumors, measured by immunofluorescence ( Figure 2c). These data suggested that in murine G3 MB, WDR11 had tumor suppressive activity since its enforced expression delayed tumor progression. To confirm these results, we over-expressed WDR11 in two additional mouse G3 MB tumorsphere lines, # 9728 and # 19251. In these tumorsphere lines, over-expression of WDR11 gene ( Figure 3b) and protein (Figure 3c) also resulted in a significant increase in survival of WDR11-expressing tumor-bearing mice compared to controls (p < 0.05) ( Figure 3a). These data identify WDR11 or its downstream effectors as potential targets for therapy in G3 MB.

WDR11 overexpression leads to down-regulation of the WNT signaling pathway
To further understand the underlying mechanism by which WDR11 overexpression contributed to the increased survival, we performed the whole transcriptome analysis

DISCUSSION
In the past few years, human MBs have been extensively characterized at the molecular level by whole genome sequencing, exome sequencing, gene expression profiling and DNA methylation array [8,10,11,31,34]. These studies highlighted the paucity of somatic mutations found in MB and the complete absence of mutations in a large proportion of tumors, particularly in G3 and G4 MBs that represent the majority of tumors. The mutation burden from the current MB mouse models was even lower than that reported in human MBs. A possible explanation for the lower mutation burden is that mouse tumors are established in a Trp53-null background and either MYC overexpression in G3 MBs, PTCH1 mutation in SHH MBs or B-CATENIN mutation in WNT MBs, all of which are found in corresponding human MB subgroups and sufficient to induce tumor development thus negating the requirement for additional driver mutations. Nevertheless, spontaneous mouse models of WNT and SHH and orthotopic models of G3 MB take 2-3 months to develop suggesting that additional mutations might be required to induce a full blown tumor.
One further question is whether the identified secondary mutations in mouse MBs match to the human MBs with similar background? In this study, we identified one Kmt2d nonsense mutation in the WNT subgroup.
Remarkably, it has been reported that human WNT MBs also frequently harbor KMT2D nonsense or frameshift mutations, accounting for 16% of MB patients [30]. However, due to the limited sample size, we didn't observe any shared mutations between human and murine SHH MBs. We previously analyzed and published chromosomal anomalies in orthotopic mouse SHH medulloblastoma derived from primary granule progenitors and found similar genomic alterations between mouse and human SHH MBs suggesting that these mouse models accurately model the human tumors [17,35]. In future studies with additional samples and different types of data such as copy number, it will be interesting to evaluate other hallmark mutations.
Among the mutations identified in the mouse models of MB, epigenetic regulators appeared enriched compare to mutations in other biological pathways. In addition to Kmt2d, another mutated epigenetic histone modification gene was Smyd1, a histone methyltransferase that contains two known functional domains: MYND domain (myeloid translocation protein 8, Nervy, DEAF1) responsible for histone deacetylase-dependent transcriptional repression, and a SET domain [Su(var)3-9,enhancer-ofzeste,trithorax] that confers histone methyltransferase activity [32,33]. Smyd1 has been reported to regulate endothelial cells [36] and skeletal muscle [37] and to be mutated in an indolent B cell non-Hodgkin lymphoma [38] but not in brain tumors or in cerebellar development. This suggests that either Smyd1 has a redundant function with other histone methyltransferases that compensate for its loss of function, or that this mutation is a passenger rather than a driver event. Indeed, in human MBs, inactivating mutations have been observed for KTM2B, KTM2C, KTM2D and SETD2, but not for other histone methyltransferases including SMYD1 [8,10,11,31,34].
Interestingly, the murine G3 MB carrying a mutant Smyd1 had a shortened life-span (Supplementary Table  3). In addition to Smyd1 mutation, this MB also carries mutation in three other genes (Lrfn2, Ubn2 and Wdr11). These genes belong to families that were previously found to be involved in neuronal development or chromatinremodeling [39][40][41][42]. Since these mutations were expected to be deleterious, this suggested that they may have tumor suppressor activity and that enforced expression of the wild type gene would suppress G3 MB progression. Indeed, enforced expression of wild-type WDR11 (but not LRFN2, Ubn2, Smyd1) hindered G3 MB development, which is in-line with our hypothesis about its tumor suppressor activity. Over-expression of WDR11 resulted in the down-regulation of WNT signaling pathway and regulators of G1 progression including Myc and D-type cyclins, Ccnd1,2 and 3 consistent with the delay in tumor development and increased survival.
WDR11, previously named BRWD2 and then revised to WDR11 due to the lack of bromodomains, is co-localized with FGFR2 in tail-to-tail manner on chromosome 10q26 [43]. WDR11 is a member of the WDrepeat gene family and ubiquitously expressed in normal brain and glial tumors [43]. The WD-repeat superfamily of proteins are found in all eukaryotes and implicated in a wide variety of functions including apoptosis, its loss of function would increase proliferation [44,45]. While the exact function of WDR11 function is unknown, the proteins in that family have been found to be involved in many biological processes. Structural analysis revealed 12 WD domains in WDR11, nine of them form two β propellers. WDR11 co-localizes and interacts with EMX1, a homeodomain transcription factor that participates in the development of the central nervous system during early development of the brain [46], and missense mutations disrupting EMX1-binding in WDR have led to Kallman Syndrome, a genetic condition causing puberty failure [39]. The MB mouse mutation is a missense SNV that changes codon 46 from isoleucine to threonine. The site is evolutionarily conservative across mammals. WDR11 was previously found to be mutated in one human MB [11] (Table 1). Another study also found WDR11 to be a tumor suppressor in glioblastoma (GBM) in which WRD11 located on chromosome 10q26 [43], is inactivated in a balanced reciprocal translocation t(10;19) in a region that show frequent loss of heterozygocity (LOH) in GBMs. As the cost decreases, sequencing genetically engineered mouse tumors is becoming an effective approach to identify additionally acquired driver mutations, which might be shared by human and mouse tumors and help unveil the common molecular mechanisms of cancer [5]. However, somatic mutation calling remains a challenge, especially when the matched normal DNA is unavailable. Even for inbred CD1 nu/nu mouse strains, the real genetic background are often found to be surprisingly complex [4], which could not be resolved by simply filtering out common SNPs. In the current analysis, we developed a robust germline filtering pipeline using publicly available whole-genome sequencing data of 18 common laboratory strains. In addition to filtering out SNPs identified from these strains, we also checked the raw sequencing data for the weak evidence of any mutation calls being present, which helped remove germline SNPs in difficult regions such as the ones with low coverage or poor quality. Our extensive germline filtering process allowed us to reduce the mutation list to a manageable size with good validation rate. However, it is also possible that real mutations might have been accidentally excluded due to potential over-filtering. In the future, it is still preferred to have a matched normal sample available for somatic mutation calling.
In conclusion, our experiments highlight the differences in number and identify of the somatic mutations found between human and mouse tumors despite their similarity in pathology and gene expression pattern. We also identified Wdr11 as a putative tumor suppressor in Group3 MB. www.impactjournals.com/oncotarget Mouse G3 MBs were generated by orthotopic transplantation of granule neuron progenitors (GNPs) purified from cerebella of 7 days old, P7, Trp53-/-; Cdkn2c-/-mice infected with Myc-encoding retroviruses into the cortices of 6-8-week-old CD-1 nu/nu mice (Charles River Laboratories), as previously described [19]. Mice were housed in an accredited facility of the Association for Assessment of Laboratory Animal Care in accordance with the NIH guidelines. The Institutional Animal Care and Use Committee of SJCRH approved all procedures in this study.

Exome-capture and illumina sequencing
Genomic DNA was extracted from mouse Shh and Wnt spontaneously occurring tumors and G3 orthotopic MBs using the Qiagen DNAeasy kit. DNA samples were submitted to the Pediatric Cancer Genome Project Validation Lab for exome sequencing. Paired end sequencing reads have been mapped to mouse reference genome mm9 assembly. The mapping statistics was obtained from 'samtools' and the coverage statistics was obtained via an in-house coverage analysis pipeline (covsum) based on the RefSeq annotation of gene coding region.

Identification of somatic mutations in mouse and mapping to human
Putative sequence variants including SNVs and indels were initially detected by running paired analyses using variation detection module of Bambino [29] with the following parameters: -min-flanking-quality 15 -min-altallele-count 2 -min-minor-frequency 0 -broad-min-quality 10 -mmf-max-hq-mismatches 15 -mmf-min-quality 15 -mmf-max-any-mismatches 20 -unique-filter-coverage 2 -min-mapq 1. A putative somatic sequence mutation was collected based on the following criteria: (1) Fisher's exact test P value indicates that the number of reads harbouring the non-reference allele is significantly higher in tumour; (2) the non-reference allele frequency in tumour is > = 10%; Substitution variants are classified into four categories based on combination of their P value and sequence quality scores: High quality, high P value; high quality, low P value; low quality, high P value; low quality, low P value. P value refers to the P value of Fisher's exact test comparing the distribution of the alternative allele in tumour and normal. High P value, P < 0.05; low P value, 0.05 < P < 0.10. A final review process re-maps and realigns the reads harbouring the non-reference allele to the reference genome to filter potential false positive calls introduced by mapping in repetitive regions and alignment artefacts. For putative somatic indels, the review process re-aligns all reads in tumour and normal at the indel site to a mutant allele template sequence constructed by substituting the wild-type allele with the indel. Presence of reads in normal sample that cover the mutant allele is considered a germline variant. Mouse gene symbols were converted to human ortholog gene symbols by using table "Orthology -Human vs. Mouse" from Mouse Genome Informatics (MGI), The Jackson Laboratory [47]. Gene symbols were mapped to geneID and RefSeq accession using "gene_info" and "gene2refseq" downloaded from NCBI (downloaded 04/29/2011). Afterwards, each mouse mutant variant was mapped to its human ortholog mRNA by running a local Blast to determine corresponding human mutation location.
To filter out germline SNPs, we first excluded any SNPs in public mouse germline SNP databases including: dbSNP build 128 [48], 18 common laboratory strains SNPs by The Wellcome Trust Sanger Institute [30], and Center for Genome Dynamics Mouse SNP Database from The Jackson Laboratory [49]. Afterwards, the remaining SNV calls were directly compared with the normal BAM files of the eighteen common laboratory strains by the Sanger Institute [30]. Putative mutations present in any of the eighteen strains were considered as non-somatic and excluded from further analyses. Remaining predictions were manually reviewed. Small insertions and deletions (indels) were identified in a similar manner. Overall, a total of 213 putative mutations, consisting of 207 SNVs and six Indels, were predicted and were sent for experimental validation.

Validating the somatic mutations with non-tumor mice
Putative somatic SNVs and indels went through experimental validation using either amplicon-based MiSeq or Sanger sequencing (Figure 1a). Since the matched germline DNA was not available, we used all available mice of common ancestors for the validation experiment. Mutations present in any of these control mice were considered as non-somatic. After excluding one uncovered mutation in the validation experiment, the validation rate for the remaining 212 predictions was 30% (64/212). The highest cause of false predictions was germline events, which accounted for 41% (88/212). After all, we identified and validated 62 somatic SNVs and two indels (Supplementary Table 2). www.impactjournals.com/oncotarget

Tumor sphere cultures, lentivirus infection, orthotopic transplants and tumor harvest
Mouse G3 MBs were grown as tumorspheres as previously described [19]. G3 MB tumorspheres # 19568 and #19251 were derived from tumors that developed in CD1nu/nu mice implanted with cerebellar granule neuronal progenitors transduced with MSCV retroviruses expressing Myc and the green fluorescent protein (GFP) from an internal ribosomal entry site (IRES) (Myc-IRES-GFP). G3 MB tumorsphere # 9728 was derived from tumors explanted from mice implanted with cerebellar granule neuronal progenitors transduced with MSCV retroviruses expressing Myc and the cyan fluorescent protein (CFP) from an internal ribosomal entry site (IRES) (Myc-IRES-CFP). Tumorspheres were derived from each of the three tumors and grown in N2, B27, EGF and FGF supplemented neurobasal medium ("complete tumorsphere medium"0, as previously described [19]. Tumorspheres were dissociated and infected six times over a 2-day period in virus collection media supplemented with N2, B27, EGF and FGF. The following day, infected tumorspheres were suspended in Matrigel (BD Bioscience, San Jose) at a concentration of 100,000 tumorspheres/5µl Matrigel and transplanted into the cortices of CD1-nu/ nu mice, as described previously [19,51]. Infection efficiency was analyzed using flow cytometry for RFP or GFP expression. In all cases, lentiviral infection efficiency ranged from 40-80%. After transplant of virus-infected tumorspheres, mice were examined daily for symptoms of sickness (doming of the head, ataxia or reduced activity). When mice became moribund, tumors were isolated and grown on a coverslip embedded in a 1:2 ratio (Matrigel: complete tumor sphere media) for 48 hours to allow for sphere formation and fixed in 4% paraformaldehyde for 15 minutes at room temperature followed by 3 rinses in PBS prior to immunofluorescence staining. Tumor chunks or cell pellets were also collected and stored at -80 • C.

Transcriptome sequencing and analysis
Total stranded RNA sequencing data were generated and mapped against mouse genome assembly NCBIM37.67 using the StrongArm pipeline described previously. Four murine G3 tumor samples overexpressing human WDR11 and three mouse G3 MB tumors expressing the empty vector control were characterized. The gene level quantification values were obtained with HT-seq [52] based on GENCODE annotation and normalized by TMM method with 'EdgeR' package [53]. Differential expression analysis was performed in 'voom' method in R 'limma' package [54]. Gene set enrichment analysis was carried out using GSEA with MSigDB [55].

Statistical analysis
The Kaplan-Meier method was used to calculate the significance of mouse survival. Statistical analyses were performed in the GraphPad Prism software version 6.0.