An ensemble approach for large-scale identification of protein-protein interactions using the alignments of multiple sequences

Lei Wang; Zhu-Hong You; Xing Chen; Jian-Qiang Li; Xin Yan; Wei Zhang; Yu-An Huang

doi:10.18632/oncotarget.14103

Oncotarget

Oncotarget. 2017; 8:5149-5159. https://doi.org/10.18632/oncotarget.14103

Lei Wang1,5,*, Zhu-Hong You2,*, Xing Chen3, Jian-Qiang Li4, Xin Yan6, Wei Zhang5, Yu-An Huang4

1School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China

2Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China

3School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China

4College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China

5College of Information Science and Engineering, Zaozhuang University, Zaozhuang, Shandong 277100, China

6School of Foreign Languages, Zaozhuang University, Zaozhuang, Shandong 277100, China

*Joint First Authors

Correspondence to:

Zhu-Hong You, email: [email protected]

Xing Chen, email: [email protected]

Keywords: disease, position-specific scoring matrix, multiple sequences alignments, cancer

Received: October 11, 2016 Accepted: November 15, 2016 Published: December 22, 2016

ABSTRACT

Protein-protein interactions (PPIs) are not only a critical component of various biological processes in cells but also the key to understanding the mechanisms that lead to healthy and diseased states in organisms. However, identifying interactions among proteins through biological experiments is time-consuming and costly. Hence, developing more efficient computational methods has rapidly become an attractive topic in the post-genomic era. In this paper, we propose a novel method for inferring protein-protein interactions from protein amino acid sequences alone. Specifically, each protein amino acid sequence is first transformed into a Position-Specific Scoring Matrix (PSSM) generated by multiple sequence alignment; the Pseudo PSSM is then used to extract feature descriptors. Finally, an ensemble Rotation Forest (RF) learning system is trained to predict PPIs based solely on protein sequence features. When applied to three benchmark data sets (Yeast, H. pylori, and an independent dataset), the proposed method achieves average accuracies of 98.38%, 89.75%, and 96.25%, respectively. To further evaluate prediction performance, we also compare the proposed method with other methods on the same benchmark data sets. The experimental results demonstrate that the proposed method consistently outperforms other state-of-the-art methods. Therefore, our method is effective and robust and can serve as a useful tool for exploring and discovering new relationships between proteins. A web server is publicly available at http://202.119.201.126:8888/PsePSSM/ for academic use.
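
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used Pseudo PSSM formulation (row-wise standardization of an L x 20 PSSM, followed by column averages and lagged mean squared differences); the abstract does not state the exact normalization, lag range, or how the two proteins of a pair are combined, so the function names, the `max_lag` parameter, and the simple concatenation in `pair_features` are illustrative assumptions rather than the authors' implementation. The Rotation Forest classifier is not shown.

```python
from statistics import mean, pstdev


def normalize_rows(pssm):
    """Standardize each row (one sequence position) across its columns.

    Assumption: zero mean and unit variance per row; constant rows map to 0.0.
    """
    normalized = []
    for row in pssm:
        mu = mean(row)
        sd = pstdev(row)
        normalized.append([(v - mu) / sd if sd > 0 else 0.0 for v in row])
    return normalized


def pse_pssm(pssm, max_lag=2):
    """Map an L x 20 PSSM to a fixed-length vector of 20 * (1 + max_lag) values.

    max_lag is a hypothetical parameter name; the abstract gives no value.
    """
    rows = normalize_rows(pssm)
    length = len(rows)
    if length <= max_lag:
        raise ValueError("sequence length must exceed max_lag")
    n_cols = len(rows[0])

    # Part 1: average score of each amino-acid column over all positions.
    features = [mean(r[j] for r in rows) for j in range(n_cols)]

    # Part 2: for each lag, the mean squared difference between positions
    # i and i + lag in each column, which keeps some sequence-order information.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (rows[i][j] - rows[i + lag][j]) ** 2 for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


def pair_features(pssm_a, pssm_b, max_lag=2):
    """Assumed pair encoding: concatenate the descriptors of both proteins."""
    return pse_pssm(pssm_a, max_lag) + pse_pssm(pssm_b, max_lag)
```

Because the output length depends only on `max_lag` and not on sequence length, proteins of different lengths yield vectors of the same size, which is what allows a standard classifier to be trained on them.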

All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 14103
