Oncotarget

Advance Publications: Research Papers:

SIPEC: Systematic identification of self-interacting proteins with ensemble classifiers using evolutionary information

Lei Wang, Z.-H. You, Shan-Wen Zhang, Tao Wang, Li-Ping Li and Ya-Ping Wu

DOI pending


Lei Wang1,*, Z.-H. You1,*, Shan-Wen Zhang1, Tao Wang1, Li-Ping Li2 and Ya-Ping Wu1

1College of Information Engineering, Xijing University, Xi’an 710123, China

2Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China

*Joint First Authors

Correspondence to:

Z.-H. You, email: [email protected]

Shan-Wen Zhang, email: [email protected]

Keywords: self-interacting proteins (SIPs); disease; pernicious anemia; amino acids; evolutionary information

Received: August 05, 2017     Accepted: January 01, 2018     Published: January 11, 2018

ABSTRACT

Protein-protein interactions (PPIs) are central to most biological processes and underlie the formation of biological mechanisms. Deregulation of PPIs results in many diseases, including cancer and pernicious anemia. Self-interacting proteins (SIPs) are a special type of PPI and occupy an important position among them. Although experimental methods have generated a large amount of SIP data, the self-interacting proteins detected so far cover only a small part of the complete network. There is therefore a great need for computational methods that predict SIPs efficiently and accurately. In the present study, we introduce a novel computational method that predicts SIPs from protein sequence information. Specifically, each protein sequence is first converted to a Position-Specific Scoring Matrix (PSSM) containing its evolutionary information. An effective feature extraction approach, Auto Covariance (AC), is then employed to construct a feature set. Finally, an improved Rotation Forest (RF) model is used to remove noise from the feature set and produce the prediction results. On yeast and human SIP data sets, the proposed method achieves high accuracies of 80.50% and 93.70%, respectively. It also performs well when compared with the SVM classifier and other existing methods. The proposed method can therefore be considered a promising model for predicting SIPs. In addition, to support further research, a user-friendly web server is freely available for academic use at http://www.proteininteraction.cn/sip/.
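
The feature extraction step described above can be illustrated with a minimal sketch. The abstract does not give the exact Auto Covariance formula or lag setting, so the code below assumes a commonly used definition: for each PSSM column, the mean of products of mean-centered scores at positions separated by a given lag. The function name `auto_covariance_features`, the `max_lag` parameter, and the toy random matrix are hypothetical and chosen for illustration only; real PSSMs would come from a sequence alignment tool, and the classifier step is not shown.

```python
import numpy as np


def auto_covariance_features(pssm, max_lag):
    """Turn an L x C PSSM into a fixed-length vector of max_lag * C values.

    Assumed definition (not taken from the paper):
    AC(lag, j) = mean over i of (P[i, j] - mean_j) * (P[i + lag, j] - mean_j)
    """
    pssm = np.asarray(pssm, dtype=float)
    length, n_cols = pssm.shape
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must satisfy 1 <= max_lag < sequence length")

    # Center each column on its mean score over the whole sequence.
    centered = pssm - pssm.mean(axis=0)

    features = np.empty((max_lag, n_cols))
    for lag in range(1, max_lag + 1):
        # Pair every position i with position i + lag in the same column.
        products = centered[:-lag] * centered[lag:]
        features[lag - 1] = products.mean(axis=0)

    # Flatten so that sequences of any length map to the same vector size.
    return features.ravel()


# Toy stand-in for a PSSM: 50 residues by 20 amino-acid columns.
rng = np.random.default_rng(0)
toy_pssm = rng.integers(-5, 6, size=(50, 20))
vector = auto_covariance_features(toy_pssm, max_lag=5)
print(vector.shape)  # (100,)
```

Because the output length depends only on `max_lag` and the number of columns, proteins of different lengths yield vectors of equal size, which is what allows a single classifier to be trained on them.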


Creative Commons License All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 24141