Research Papers:

Systematic evaluation of supervised classifiers for fecal microbiota-based prediction of colorectal cancer

Oncotarget. 2017; 8:9546-9556. https://doi.org/10.18632/oncotarget.14488


Luoyan Ai1,*, Haiying Tian1,*, Zhaofei Chen1,*, Huimin Chen1, Jie Xu1, Jing-Yuan Fang1

1Division of Gastroenterology and Hepatology, Shanghai Institute of Digestive Disease, Key Laboratory of Gastroenterology and Hepatology, Ministry of Health, State Key Laboratory for Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao-Tong University, Shanghai 200001, China

*These authors contributed equally to this work

Correspondence to:

Jing-Yuan Fang, email: jingyuanfang@sjtu.edu.cn

Jie Xu, email: xujieletter@gmail.com

Keywords: gut microbiota, CRC, supervised classifier, prediction

Received: October 27, 2016     Accepted: December 15, 2016     Published: January 04, 2017


Predicting colorectal cancer (CRC) from fecal microbiota is a promising method for non-invasive CRC screening, but how best to optimize the classification models remains an open question. The purpose of this study was to systematically evaluate the effectiveness of different supervised machine-learning models in predicting CRC in two independent populations, one Eastern and one Western. The structure of the fecal microbiota in a Chinese population (N = 141) was determined by 454 FLX pyrosequencing, and different supervised classifiers were used to predict CRC from fecal microbiota operational taxonomic units (OTUs). Bayes Net and Random Forest achieved higher accuracies than the other algorithms in both populations, and Bayes Net had a lower false-negative rate than Random Forest. Gut microbiota-based prediction was more accurate than the standard fecal occult blood test (FOBT), and combining the two approaches further improved prediction accuracy. Moreover, when unclassified OTUs were used as input, the BayesDMNB text algorithm achieved higher accuracy in the Chinese population (AUC = 0.994). Taken together, our results suggest that the Bayes Net classification model combined with unclassified OTUs may provide an accurate method for predicting CRC from the composition of the gut microbiota.
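
The abstract compares classifiers by accuracy, false-negative rate, and AUC. As a reading aid, the following minimal Python sketch shows how these three metrics can be computed from true labels and predicted scores. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), with 1 = CRC and 0 = control."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true CRC cases that the classifier missed."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return fn / (fn + tp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three CRC cases and three controls.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]

# Assumed decision threshold of 0.5 (not taken from the paper).
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
fnr = false_negative_rate(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.3f}  FNR={fnr:.3f}  AUC={auc:.3f}")
```

In a screening setting a false negative is a missed cancer, so two classifiers with similar accuracy can still differ in practical value if one has a lower false-negative rate.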

PII: 14488