Research Papers:

Precision and recall oncology: combining multiple gene mutations for improved identification of drug-sensitive tumours

Stefan Naulaerts, Cuong C. Dang and Pedro J. Ballester

Oncotarget. 2017; 8:97025-97040. https://doi.org/10.18632/oncotarget.20923

Stefan Naulaerts (1,2,3,4), Cuong C. Dang (5) and Pedro J. Ballester (1,2,3,4)

(1) Computational Biology and Drug Design, Cancer Research Center of Marseille, INSERM U1068, Marseille, France

(2) Institut Paoli-Calmettes, Marseille, France

(3) Aix-Marseille Université, Marseille, France

(4) CNRS UMR7258, Marseille, France

(5) Faculty of Information Technology, VNU University of Engineering and Technology, Hanoi, Vietnam

Correspondence to:

Pedro J. Ballester, email: [email protected]

Keywords: biomarker discovery, machine learning, drug sensitivity, genomics, cancer

Received: March 16, 2017     Accepted: August 14, 2017     Published: September 15, 2017


Cancer drug therapies are only effective in a small proportion of patients. To make things worse, our ability to identify these responsive patients before administering a treatment is generally very limited. The recent arrival of large-scale pharmacogenomic data sets, which measure the sensitivity of molecularly profiled cancer cell lines to a panel of drugs, has boosted research on the discovery of drug sensitivity markers. However, no systematic comparison of widely used single-gene markers with multi-gene machine-learning markers exploiting genomic data has so far been conducted. We therefore assessed how well these two types of models discriminate between cell lines that are sensitive and cell lines that are resistant to a given drug. This was carried out for each of the 127 considered drugs, using genomic data characterising the cell lines. We found that the proportion of cell lines predicted to be sensitive that are actually sensitive (precision) varies strongly with the drug and the type of model used. Furthermore, the proportion of sensitive cell lines that are correctly predicted as sensitive (recall) was lower for the best single-gene marker than for the multi-gene marker in 118 of the 127 tested drugs. We conclude that single-gene markers can only identify those drug-sensitive cell lines that carry the considered actionable mutation, whereas multi-gene markers can in principle combine multiple gene mutations to identify additional sensitive cell lines. We also found that cell line sensitivities to some drugs (e.g. Temsirolimus, 17-AAG or Methotrexate) are better predicted by these machine-learning models.
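
As an illustration of the two metrics defined in the abstract, the following minimal Python sketch computes precision and recall for a single-gene marker and for a toy rule that flags a cell line when either of two genes is mutated. The data, the gene names and the "either mutation" rule are hypothetical and chosen only to make the arithmetic easy to follow; they are not the authors' data or their machine-learning models.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans.

    precision = TP / (TP + FP): predicted-sensitive lines that are sensitive.
    recall    = TP / (TP + FN): sensitive lines that are predicted sensitive.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if a and not p)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Hypothetical toy data for eight cell lines (not taken from the paper).
sensitive = [True, True, True, True, False, False, False, False]
gene_a_mutated = [True, True, False, False, False, False, False, False]
gene_b_mutated = [False, False, True, False, True, False, False, False]

# Single-gene marker: predict "sensitive" when gene A is mutated.
single = precision_recall(gene_a_mutated, sensitive)

# Toy multi-gene rule (an assumption for illustration): predict "sensitive"
# when either gene is mutated.
either_mutated = [a or b for a, b in zip(gene_a_mutated, gene_b_mutated)]
multi = precision_recall(either_mutated, sensitive)

print("single-gene (precision, recall):", single)
print("multi-gene  (precision, recall):", multi)
```

In this toy case the combined rule recovers one more sensitive cell line than the single-gene marker, which raises recall, at the cost of one false positive, which lowers precision. That trade-off is invented for the example and says nothing about the results for any of the 127 drugs.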

All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 License.
PII: 20923