Research Papers:

Controlling for cellular heterogeneity using single-cell deconvolution of gene expression reveals novel markers of colorectal tumors exhibiting microsatellite instability

Matthew A.M. Devall and Graham Casey


Oncotarget. 2021; 12:767-782. https://doi.org/10.18632/oncotarget.27935



Matthew A.M. Devall¹ and Graham Casey¹

¹ Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA

Correspondence to:

Graham Casey, email: [email protected]

Keywords: colorectal cancer; single-cell deconvolution; microsatellite instability; RNA-sequencing; enteroendocrine

Received: January 23, 2021     Accepted: March 22, 2021     Published: April 13, 2021

Copyright: © 2021 Devall and Casey. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Approximately 15% of colorectal cancer (CRC) cases present with high levels of microsatellite instability (MSI-H). Bulk RNA-sequencing approaches have been employed to elucidate transcriptional differences between MSI-H and microsatellite stable (MSS) CRC tumors. However, these approaches are frequently confounded by the complex cellular heterogeneity of tumors. We performed single-cell deconvolution of bulk RNA-sequencing on The Cancer Genome Atlas colon adenocarcinoma (TCGA-COAD) dataset. Cell composition within each dataset was estimated using CIBERSORTx, and differences in cell composition were analyzed using linear regression. Significant differences in abundance were observed for 13 of 19 cell types between MSI-H tumors and MSS or low microsatellite instability (MSI-L) tumors in TCGA-COAD. These included a novel finding of increased enteroendocrine (q = 3.71E-06) and reduced colonocyte (q = 2.21E-03) populations in MSI-H versus MSS/MSI-L tumors. We were able to validate some of these differences in an independent biopsy dataset. By incorporating cell composition into our regression model, we identified 3,193 differentially expressed genes (q-value threshold of 0.05), of which 556 were deemed novel. We subsequently validated many of these genes in an independent dataset of colon cancer cell lines. In summary, we show that some of the challenges associated with cellular heterogeneity can be overcome using single-cell deconvolution, and through our analysis we highlight several novel gene targets for further investigation.
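The two statistical steps named in the abstract, a linear regression of cell-type abundance on MSI status and the reporting of q-values, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names `ols_fit` and `benjamini_hochberg` are hypothetical, the input numbers are invented toy values, and the use of the Benjamini-Hochberg procedure to obtain q-values is an assumption, since the abstract does not state which multiple-testing correction was applied.

```python
from statistics import mean


def ols_fit(x, y):
    """Fit y ~ intercept + slope * x by ordinary least squares.

    With x coded 0 (MSS/MSI-L) or 1 (MSI-H), the slope equals the
    difference in mean cell-type fraction between the two groups.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here only as a common way to obtain q-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


if __name__ == "__main__":
    # Toy data (invented): MSI status and one estimated cell-type fraction.
    msi_status = [0, 0, 1, 1]
    fraction = [0.10, 0.12, 0.20, 0.22]
    slope, intercept = ols_fit(msi_status, fraction)
    print("slope:", slope, "intercept:", intercept)

    # Toy p-values (invented), one per cell type tested.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In a full analysis, the same regression would be fitted once per cell type, and the resulting p-values would be adjusted together. Adjusting differential expression for cell composition would additionally require the estimated fractions as extra covariates in a multiple regression, which this single-predictor sketch does not cover.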

PII: 27935