DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest

Balachandran Manavalan; Tae Hwan Shin; Gwang Lee

doi:10.18632/oncotarget.23099

Oncotarget

Oncotarget. 2018; 9:1944-1956. https://doi.org/10.18632/oncotarget.23099

Balachandran Manavalan (1), Tae Hwan Shin (1,2) and Gwang Lee (1,2)

(1) Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea

(2) Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea

Correspondence to:

Balachandran Manavalan, email: [email protected]

Gwang Lee, email: [email protected]

Keywords: DNase I hypersensitive site; feature selection; machine learning; random forest; support vector machine

Received: September 06, 2017; Accepted: November 17, 2017; Published: December 08, 2017

ABSTRACT

DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Identifying DHSs in uncharacterized DNA sequences is therefore crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven too expensive for genome-wide application, which makes computational methods for DHS prediction necessary. In this study, we propose a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified, using a random forest algorithm, from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the effectiveness of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community and is freely available at: http://www.thegleelab.org/DHSpred.html.
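
To make two ingredients named in the abstract concrete, the following is a minimal Python sketch of (a) nucleotide composition features and (b) the two reported evaluation metrics, accuracy and the Matthews correlation coefficient. It is an illustration under stated assumptions, not the DHSpred implementation: the function names are hypothetical, the 84 composition values shown here are not the 174 optimal features used by DHSpred, and the physicochemical property encodings, random forest feature selection, and SVM training are not reproduced.

```python
from itertools import product
from math import sqrt


def kmer_composition(seq, k):
    """Frequency of every DNA k-mer in seq, in lexicographic A, C, G, T order.

    Hypothetical helper for illustration; not taken from DHSpred.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def composition_features(seq):
    """Concatenate mono-, di- and trinucleotide frequencies (4 + 16 + 64 = 84 values)."""
    return (kmer_composition(seq, 1)
            + kmer_composition(seq, 2)
            + kmer_composition(seq, 3))


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions, given confusion-matrix counts."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    features = composition_features("ACGTTGCAACGT")  # made-up example sequence
    print(len(features))
    print(accuracy(40, 45, 5, 10), mcc(40, 45, 5, 10))  # made-up counts
```

In a full pipeline of the kind the abstract describes, vectors like these would be ranked by a random forest and the top-ranked subset passed to an SVM; those steps are left out here because the abstract does not specify their settings.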

All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 23099
