Robust in-silico identification of cancer cell lines based on next generation sequencing
Raik Otto1, Christine Sers2,3, Ulf Leser1
1Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany
2Charité Universitätsmedizin Berlin, Institute of Pathology, Berlin, Germany
3DKTK, German Consortium for Translational Cancer Research, Partner Site, Berlin, Germany
Raik Otto, email: [email protected]
Keywords: cancer cell lines, next-generation sequencing, cell line-identification, DNA-sequencing, data-heterogeneity and incompleteness
Received: January 10, 2017; Accepted: March 01, 2017; Published: March 10, 2017
Cancer cell lines (CCLs) are important tools for cancer researchers worldwide. However, handling of cancer cell lines is error-prone, and critical errors such as misidentification and cross-contamination occur more often than is acceptable. Because CCLs are now very often sequenced, partly or entirely, as part of the studies in which they are used, we developed Uniquorn, a computational method that reliably identifies CCL samples based on variant profiles derived from whole-exome or whole-genome sequencing. Notably, Uniquorn requires neither a particular sequencing technology nor a particular downstream analysis pipeline, but works robustly across different NGS platforms and analysis steps. We evaluated Uniquorn by comparing more than 1900 CCL profiles from three large CCL libraries, including 1585 duplicates, against each other. In this setting, our method achieves a sensitivity of 97% and a specificity of 99%. Errors are strongly associated with low-quality mutation profiles. The R package Uniquorn is freely available as a Bioconductor package.
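To make the idea of identifying a sample from its variant profile concrete, the following Python sketch compares a query profile with a small reference library and computes sensitivity and specificity from confusion counts. It is only an illustration under stated assumptions: the Jaccard overlap score, the threshold of 0.5, the function names, and the toy profiles are hypothetical and are not taken from Uniquorn, whose actual scoring is not described in this abstract.

```python
from typing import Dict, FrozenSet, List, Tuple

# A variant is (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]
Profile = FrozenSet[Variant]


def jaccard(a: Profile, b: Profile) -> float:
    """Fraction of shared variants among all variants seen in either profile."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_matches(
    query: Profile, library: Dict[str, Profile], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return library entries scoring at or above the (assumed) threshold, best first."""
    scored = [(name, jaccard(query, profile)) for name, profile in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data, invented for illustration only.
library = {
    "LINE_A": frozenset({("1", 100, "A", "T"), ("2", 200, "G", "C"),
                         ("3", 300, "C", "G"), ("4", 400, "T", "A")}),
    "LINE_B": frozenset({("5", 500, "A", "G"), ("6", 600, "C", "T")}),
}
query = frozenset({("1", 100, "A", "T"), ("2", 200, "G", "C"), ("3", 300, "C", "G")})
matches = rank_matches(query, library)
```

In this toy case the query shares three of the four variants seen across itself and `LINE_A`, and none with `LINE_B`, so only `LINE_A` passes the threshold. A plain overlap score treats every variant equally; a real method would presumably need to account for variants that are common to many cell lines and for profiles of uneven quality.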
All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 License.