Prediction of the aquatic toxicity of aromatic compounds to Tetrahymena pyriformis through support vector regression
Qiang Su1, Wencong Lu2, Dongshu Du1,3, Fuxue Chen1, Bing Niu1,4 and Kuo-Chen Chou4,5,6
1College of Life Science, Shanghai University, Shanghai 200444, China
2Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
3Department of Life Science, Heze University, Shandong 274500, China
4Gordon Life Science Institute, Boston, MA 02478, USA
5Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
6Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Fuxue Chen, email: email@example.com
Dongshu Du, email: firstname.lastname@example.org
Keywords: aromatic compounds, Tetrahymena pyriformis, QSAR, genetic algorithm, mRMR
Received: March 10, 2017 Accepted: March 30, 2017 Published: April 13, 2017
Toxicity evaluation is an extremely important step in drug development. It usually begins with animal experiments, which are time-consuming and costly. To speed up this process, a quantitative structure-activity relationship (QSAR) study was performed to develop a computational model correlating the structures of 581 aromatic compounds with their aquatic toxicity to Tetrahymena pyriformis. A set of 68 molecular descriptors, derived solely from the structures of the aromatic compounds, was calculated using Gaussian 03, HyperChem 7.5, and TSAR V3.3. A combined feature selection method, minimum Redundancy Maximum Relevance (mRMR)-genetic algorithm (GA)-support vector regression (SVR), was applied to select the best descriptor subset for the QSAR analysis. The SVR method was then used to model toxic potency from a training set of 500 compounds, with five-fold cross-validation used to optimize the parameters of the SVR model. The resulting SVR model was tested on an independent dataset of 81 compounds. Both high internal consistency and high external predictive performance were obtained, indicating that the SVR model is a promising tool for the rapid detection of toxicity.
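The descriptor selection and cross-validation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes absolute Pearson correlation as a stand-in for the mutual information that mRMR normally uses, it omits the GA search and the SVR fit entirely, and the function names (`pearson`, `mrmr_rank`, `k_fold_indices`) and the synthetic data are hypothetical. Only the fold count (five) and the training-set size (500) come from the abstract.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_rank(columns, target, k):
    """Greedy mRMR-style ranking of descriptor columns.

    Assumption: |Pearson r| replaces mutual information for both the
    relevance (descriptor vs. target) and redundancy (descriptor vs.
    already selected descriptors) terms.
    """
    k = min(k, len(columns))
    relevance = [abs(pearson(col, target)) for col in columns]
    # Start with the single most relevant descriptor.
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        remaining = [j for j in range(len(columns)) if j not in selected]
        selected.append(max(remaining, key=score))
    return selected


def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Synthetic stand-in data (hypothetical, not from the paper):
# column 1 is a rescaled copy of column 0, column 2 is independent.
rng = random.Random(0)
x0 = [rng.gauss(0, 1) for _ in range(200)]
x1 = [2 * v for v in x0]
x2 = [rng.gauss(0, 1) for _ in range(200)]
toxicity = [a + 0.5 * b for a, b in zip(x0, x2)]

selected = mrmr_rank([x0, x1, x2], toxicity, k=2)

# Five folds over a 500-compound training set, as in the abstract.
folds = k_fold_indices(500, k=5)
```

In this toy setting the redundancy penalty is what keeps the rescaled copy from being chosen alongside its original. In a fuller pipeline, each fold would serve once as the validation set while an SVR is fitted on the other four, and the 81 held-out compounds would be scored only after the parameters are fixed.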