SSCMDA: spy and super cluster strategy for MiRNA-disease association prediction
Qi Zhao1,2, Di Xie1, Hongsheng Liu2,3, Fan Wang4,5, Gui-Ying Yan6 and Xing Chen7
1School of Mathematics, Liaoning University, Shenyang, China
2Research Center for Computer Simulating and Information Processing of Bio-Macromolecules of Liaoning Province, Shenyang, China
3School of Life Science, Liaoning University, Shenyang, China
4School of Mechatronic Engineering, China University of Mining and Technology, Xuzhou, China
5Jiangsu Key Laboratory of Mine Mechanical and Electrical Equipment, China University of Mining and Technology, Xuzhou, China
6Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
7School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
Qi Zhao, email: [email protected]
Xing Chen, email: [email protected]
Keywords: microRNA; disease; association prediction; spy strategy; super cluster strategy
Received: July 18, 2017 Accepted: October 30, 2017 Published: December 01, 2017
Identifying associations between microRNAs (miRNAs) and diseases has attracted increasing attention because of its value for clinical medicine. However, confirming miRNA-disease associations experimentally is expensive and time-consuming. Therefore, several effective computational models for predicting potential miRNA-disease associations have been developed in recent years. In this paper, we propose the Spy and Super Cluster strategy for MiRNA-Disease Association prediction (SSCMDA), which is based on known miRNA-disease associations, integrated disease similarity and integrated miRNA similarity. The unknown miRNA-disease pairs are a mixture of potential associations and true negative associations, which leads to inaccurate prediction, so SSCMDA adopts a spy strategy to identify reliable negative samples among the unknown pairs. Moreover, the super-cluster strategy gathers as many positive samples as possible, which improves prediction accuracy by compensating for the shortage of positive training samples. The AUCs of global leave-one-out cross validation (LOOCV), local LOOCV and 5-fold cross validation were 0.9007, 0.8747 and 0.8806 ± 0.0025, respectively. According to these AUC results, SSCMDA shows a significant improvement over some previous models. We further carried out case studies based on various versions of the HMDD database to test the robustness of SSCMDA's prediction performance. We also carried out a case study to examine whether SSCMDA is effective for new diseases without any known associated miRNAs. As a result, a large proportion of the predicted miRNAs have been verified by experimental reports.
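To make the idea of a spy strategy concrete, the following is a minimal sketch of the generic spy technique from positive-unlabeled learning, not the SSCMDA implementation itself. It assumes each miRNA-disease pair has already been turned into a numeric feature vector (how SSCMDA builds its features is not described in this abstract), and it uses a simple centroid-distance score as a stand-in for whatever classifier the method actually trains. The function name and the parameters `spy_frac` and `spy_quantile` are hypothetical.

```python
import numpy as np


def spy_reliable_negatives(X_pos, X_unl, spy_frac=0.1, spy_quantile=0.05, seed=0):
    """Return indices into X_unl that look like reliable negatives.

    X_pos: feature vectors of known (positive) pairs, shape (n_pos, d).
    X_unl: feature vectors of unknown (unlabeled) pairs, shape (n_unl, d).
    Illustrative sketch only; the scoring rule is an assumption.
    """
    n_pos = len(X_pos)
    if n_pos < 2:
        raise ValueError("need at least two positives to hold some out as spies")

    # Pick a small random subset of the positives to act as spies.
    rng = np.random.default_rng(seed)
    n_spy = min(max(1, round(spy_frac * n_pos)), n_pos - 1)
    spy_idx = rng.choice(n_pos, size=n_spy, replace=False)
    keep = np.ones(n_pos, dtype=bool)
    keep[spy_idx] = False

    # Hide the spies among the unlabeled samples.
    pos_centroid = X_pos[keep].mean(axis=0)
    mixed_centroid = np.vstack([X_unl, X_pos[spy_idx]]).mean(axis=0)

    def score(X):
        # Higher score: closer to the positives than to the mixed set.
        d_pos = np.linalg.norm(X - pos_centroid, axis=1)
        d_mix = np.linalg.norm(X - mixed_centroid, axis=1)
        return d_mix - d_pos

    # Threshold at a low quantile of the spy scores, so most spies
    # (which are true positives) score above it.
    threshold = np.quantile(score(X_pos[spy_idx]), spy_quantile)

    # Unlabeled samples scoring below the spies are kept as reliable negatives.
    return np.flatnonzero(score(X_unl) < threshold)
```

The spies are known positives, so their scores show how a hidden positive in the unknown set is likely to behave. Unknown pairs that score clearly below the spies are unlikely to be hidden positives and can be used as negative training samples.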
All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.