Distinct distributions of genomic features of the 5’ and 3’ partners of coding somatic cancer gene fusions: arising mechanisms and functional implications
Yongzhong Zhao1,2, Won-Min Song1,2, Fan Zhang3, Ming-Ming Zhou4, Weijia Zhang3, Martin J. Walsh1,4,5 and Bin Zhang1,2
1Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY 10029, USA
2Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, NY 10029, USA
3Department of Medicine, Icahn School of Medicine at Mount Sinai, NY 10029, USA
4Department of Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, NY 10029, USA
5Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Bin Zhang, email: firstname.lastname@example.org
Keywords: cancer somatic gene fusions, gene age, GC skew, DNA-RNA R-loops, somatic amplification
Received: April 02, 2016 Accepted: June 06, 2016 Published: July 20, 2016
The genomic features and arising mechanisms of coding cancer somatic gene fusions (CSGFs) remain largely elusive. In this study, we show that CSGF partners follow a gene origin stratification pattern: fusion partners in human cancers are significantly enriched for genes with a gene age of Euteleostomes and a gene family age of Bilateria. GC skew (a measure of G and C nucleotide content bias, (G-C)/(G+C)) is a useful indicator of the DNA leading strand, lagging strand, replication origin, and replication terminus, and of DNA-RNA R-loop formation. We find GC skew bias at the 5 prime (5’) but not the 3 prime (3’) partners of CSGFs, coincident with a polarity in gene expression breadth: the 5’ partners are generally more ubiquitously expressed, while the 3’ partners are more tissue specific. We reveal distinct length and composition distributions of the 5’ and 3’ partners of CSGFs, including sequence features corresponding to the 5’ untranslated regions (UTRs), 3’ UTRs, and the N-terminal sequences of the encoded proteins. Oncogenic somatic gene fusions are most enriched for somatic amplification of the 5’ and 3’ genes, alongside a substantial proportion of other types of combinations. At the functional level, 5’ partners of CSGFs appear more likely to be tumour suppressor genes, while many 3’ partners appear to be proto-oncogenes. Such distinct polarities of CSGFs at the evolutionary, structural, genomic and functional levels indicate heterogeneous arising mechanisms of CSGFs, including R-loops, and suggest potential novel targeted therapeutics specific to CSGF functional categories.
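As an illustration of the GC skew definition given in the abstract, (G-C)/(G+C), the following minimal Python sketch computes the skew for a sequence and across sliding windows. The function names, the window and step parameters, and the choice to return 0.0 when a window contains no G or C are assumptions made for this example; they are not taken from the study's methods.

```python
from typing import List


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for a DNA sequence.

    Returns 0.0 when the sequence contains no G or C
    (an illustrative convention, not one stated in the paper).
    """
    s = seq.upper()
    g = s.count("G")
    c = s.count("C")
    total = g + c
    return (g - c) / total if total else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> List[float]:
    """Compute GC skew over sliding windows of the given size and step.

    Only full-length windows are scored; a trailing partial window is ignored.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        gc_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]
```

For instance, gc_skew("GGGC") counts three G and one C, giving (3-1)/(3+1) = 0.5. A positive value indicates an excess of G over C on the strand being read, and a negative value the reverse.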
All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 License.