iPhos-PseEn: Identifying phosphorylation sites in proteins by fusing different pseudo components into an ensemble classifier
Wang-Ren Qiu1,2, Xuan Xiao1,3, Zhao-Chun Xu1, Kuo-Chen Chou3,4,5
1Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
2Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA
3Gordon Life Science Institute, Boston, MA, USA
4Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia
5Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
Wang-Ren Qiu, email: [email protected]
Xuan Xiao, email: [email protected]
Kuo-Chen Chou, email: [email protected]
Keywords: protein phosphorylation, pseudo components, random forests, ensemble classifier
Received: April 05, 2016 Accepted: May 23, 2016 Published: June 13, 2016
Protein phosphorylation is a posttranslational modification (PTM or PTLM) in which a phosphoryl group is added to one or more residues of a protein molecule. The most commonly phosphorylated amino acids are serine (S), threonine (T), and tyrosine (Y). Protein phosphorylation plays a significant role in a wide range of cellular processes, and its dysregulation is implicated in many diseases. Therefore, for both basic research and drug development, we face a challenging problem: given an uncharacterized protein sequence containing many S, T, or Y residues, which ones can be phosphorylated, and which ones cannot? To address this problem, we have developed a predictor called iPhos-PseEn by fusing four different pseudo component approaches (amino acids’ disorder scores, nearest neighbor scores, occurrence frequencies, and position weights) into an ensemble classifier via a voting system. Rigorous cross-validations indicated that the proposed predictor remarkably outperformed its existing counterparts. For the convenience of most experimental scientists, a user-friendly web-server for iPhos-PseEn has been established at http://www.jci-bioinfo.cn/iPhos-PseEn, by which users can easily obtain their desired results without needing to go through the complicated mathematical equations involved.
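As a rough illustration of the two ideas named in the abstract (scanning a sequence for candidate S, T, or Y residues, and fusing four per-view predictions by voting), the following minimal Python sketch may help. It is not the authors' implementation: the view names, the function names `candidate_sites` and `vote`, the 0/1 encoding of predictions, and the rule that a 2-2 tie counts as non-phosphorylated are all assumptions made here for illustration only.

```python
from typing import Dict, List

# Hypothetical labels for the four pseudo component views named in the abstract.
VIEWS = ["disorder", "nearest_neighbor", "frequency", "position_weight"]


def candidate_sites(sequence: str, residues: str = "STY") -> List[int]:
    """Return the 1-based positions of S, T, or Y residues in a protein sequence."""
    return [i + 1 for i, aa in enumerate(sequence) if aa in residues]


def vote(predictions: Dict[str, int]) -> int:
    """Fuse one binary prediction (1 = phosphorylated) per view by majority vote.

    Assumption: with four voters a 2-2 tie is possible; this sketch
    resolves a tie as 0 (non-phosphorylated).
    """
    positives = sum(predictions[view] for view in VIEWS)
    return 1 if positives * 2 > len(VIEWS) else 0


if __name__ == "__main__":
    # Toy sequence; the per-view predictions below are made up for the demo.
    sites = candidate_sites("MSTAY")
    per_view = {"disorder": 1, "nearest_neighbor": 1,
                "frequency": 0, "position_weight": 1}
    print(sites, vote(per_view))
```

In practice each entry of `per_view` would come from a classifier trained on that view's features (the keywords mention random forests), and the tie rule would need to follow whatever the full paper specifies.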
All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.