Research Papers:
Pse-Analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods
Bin Liu1,2,3, Hao Wu1, Deyuan Zhang4, Xiaolong Wang1,2, Kuo-Chen Chou3,5
1School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
2Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
3Gordon Life Science Institute, Boston, Massachusetts, USA
4School of Computer, Shenyang Aerospace University, Shenyang, Liaoning, China
5Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
Correspondence to:
Bin Liu, email: [email protected], [email protected]
Kuo-Chen Chou, email: [email protected]
Keywords: sequence analysis, pseudo components, support vector machine, genome/proteome analysis
Received: December 02, 2016 Accepted: December 27, 2016 Published: January 05, 2017
ABSTRACT
To expedite genome/proteome analysis, we have developed a Python package called Pse-Analysis. The package automatically completes the following five procedures: (1) sample feature extraction, (2) optimal parameter selection, (3) model training, (4) cross validation, and (5) evaluation of prediction quality. The user only needs to input a benchmark dataset along with the query biological sequences concerned. Based on the benchmark dataset, Pse-Analysis automatically constructs a predictor with the selected optimal parameters and then yields the predicted results for the submitted query samples. Moreover, multiprocessing was adopted to increase computational speed about sixfold. The Pse-Analysis Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/Pse-Analysis/, and can be run directly on Windows, Linux, and Unix.
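The five procedures listed in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the Pse-Analysis API: every function name here (`kmer_features`, `train`, `predict`, `cross_validate`, `build_predictor`) is hypothetical, plain k-mer frequencies stand in for the pseudo components, a nearest-centroid classifier stands in for the kernel method (support vector machine), and the toy sequences and labels are invented purely for illustration.

```python
from itertools import product
from math import dist

ALPHABET = "ACGT"  # assumption: DNA input; the real package also handles RNA and proteins


def kmer_features(seq, k):
    """Step 1 (feature extraction): normalized k-mer frequency vector."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = max(sum(counts.values()), 1)  # avoid division by zero on short input
    return [counts[m] / total for m in kmers]


def train(vectors, labels):
    """Step 3 (model training): one mean vector (centroid) per class."""
    centroids = {}
    for label in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, vector):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


def cross_validate(seqs, labels, k, folds=3):
    """Steps 4-5 (cross validation and evaluation): overall accuracy."""
    vectors = [kmer_features(s, k) for s in seqs]
    correct = 0
    for f in range(folds):
        # Sample i belongs to test fold (i % folds); the rest is training data.
        train_idx = [i for i in range(len(seqs)) if i % folds != f]
        test_idx = [i for i in range(len(seqs)) if i % folds == f]
        model = train([vectors[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(predict(model, vectors[i]) == labels[i] for i in test_idx)
    return correct / len(seqs)


def build_predictor(seqs, labels, k_values=(1, 2, 3)):
    """Step 2 (parameter selection): keep the k with the best CV accuracy."""
    best_k = max(k_values, key=lambda k: cross_validate(seqs, labels, k))
    model = train([kmer_features(s, best_k) for s in seqs], labels)
    return best_k, model


# Toy benchmark dataset and query sequence (invented for illustration only).
benchmark_seqs = ["ATATATATAT", "TATATTATAA", "AATTATATTA",
                  "GCGCGCGGCC", "CCGGCGCGCG", "GGCCGCGCCG"]
benchmark_labels = ["AT-rich"] * 3 + ["GC-rich"] * 3

best_k, model = build_predictor(benchmark_seqs, benchmark_labels)
query_prediction = predict(model, kmer_features("ATTATAATAT", best_k))
print(best_k, query_prediction)
```

The sketch mirrors the workflow described above: the caller supplies only a labeled benchmark set and a query sequence, and parameter selection, training, and evaluation happen inside `build_predictor`. The candidate parameter values are evaluated independently of one another, which is the kind of work that multiprocessing can spread across cores.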
All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 14524