Research Papers:

Evaluating the bias of circRNA predictions from total RNA-Seq data

Jinzeng Wang, Kang Liu, Ya Liu, Qi Lv, Fan Zhang and Haiyun Wang

Oncotarget. 2017; 8:110914-110921. https://doi.org/10.18632/oncotarget.22972


Jinzeng Wang (1,2), Kang Liu (1), Ya Liu (1), Qi Lv (1), Fan Zhang (1,3) and Haiyun Wang (1)

(1) School of Life Sciences and Technology, Tongji University, Shanghai 200092, China

(2) National Research Center for Translational Medicine (Shanghai), Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China

(3) Clinical Translational Research Center, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai 200433, China

Correspondence to:

Haiyun Wang, email: [email protected]

Keywords: circular RNA; circRNA predictions; total RNA-Seq; CIRI; KNIFE

Received: July 08, 2017     Accepted: November 13, 2017     Published: December 06, 2017


Circular RNAs (circRNAs) are a group of endogenous noncoding RNAs. Rapidly developing high-throughput RNA sequencing technologies, together with novel bioinformatics approaches, have enabled researchers to systematically identify circRNAs and study their biological functions in cells. Deep sequencing of rRNA-depleted RNA treated with RNase R, which digests linear RNAs and leaves circRNAs enriched, is an efficient way to identify circRNAs. However, very few RNase R-treated datasets are available, whereas a large amount of total RNA-Seq data can be used for circRNA prediction at no additional sequencing cost. In this study, we systematically investigated the prediction bias that arises from total RNA-Seq data, as well as the influence of sequencing depth, sequencing quality, and single-end versus paired-end sequencing strategy on the predictions. We also identified circRNA properties that may help improve prediction performance. Our analysis shows that circRNA predictions from total RNA-Seq data yield ~50% true positives. Sequencing errors, rather than a single-end sequencing strategy or low sequencing depth, dramatically worsen the predictions. However, false positives can be controlled by using good-quality data and by narrowing down candidate circRNAs according to their properties.
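As an illustration of how a true positive fraction of this kind might be computed, the sketch below compares circRNA calls from a total RNA-Seq sample with calls from an RNase R-treated reference. It is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes that a predicted circRNA counts as a true positive when its back-splice junction also appears in the reference, and the function name, the junction key (chromosome, start, end, strand), and the toy coordinates are all hypothetical.

```python
from typing import Iterable, Set, Tuple

# Assumption for illustration: a circRNA candidate is identified by its
# back-splice junction, keyed as (chromosome, start, end, strand).
Junction = Tuple[str, int, int, str]


def true_positive_fraction(predicted: Iterable[Junction],
                           reference: Iterable[Junction]) -> float:
    """Return the fraction of predicted junctions also found in the reference.

    Returns 0.0 when there are no predictions, to avoid dividing by zero.
    """
    predicted_set: Set[Junction] = set(predicted)
    if not predicted_set:
        return 0.0
    reference_set: Set[Junction] = set(reference)
    return len(predicted_set & reference_set) / len(predicted_set)


# Hypothetical toy data, not taken from the study.
total_rna_calls = [
    ("chr1", 1000, 2000, "+"),
    ("chr1", 5000, 6500, "-"),
    ("chr2", 300, 900, "+"),
    ("chr3", 7000, 7800, "+"),
]
rnase_r_calls = [
    ("chr1", 1000, 2000, "+"),
    ("chr3", 7000, 7800, "+"),
    ("chr5", 100, 400, "-"),
]

fraction = true_positive_fraction(total_rna_calls, rnase_r_calls)
print(f"{fraction:.2f}")  # 2 of the 4 predicted junctions are in the reference
```

Exact junction matching is the simplest possible criterion; a real comparison might tolerate small coordinate offsets or require a minimum number of supporting reads, which this sketch does not attempt.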

All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 22972