Identifying and analyzing different cancer subtypes using RNAseq data of blood platelets

Yu-Hang Zhang; Tao Huang; Lei Chen; YaoChen Xu; Yu Hu; Lan-Dian Hu; Yudong Cai; Xiangyin Kong

doi:10.18632/oncotarget.20903

Oncotarget

Oncotarget. 2017; 8:87494-87511. https://doi.org/10.18632/oncotarget.20903

Yu-Hang Zhang1,2,*, Tao Huang2,*, Lei Chen4,*, YaoChen Xu5, Yu Hu2, Lan-Dian Hu2, Yudong Cai3 and Xiangyin Kong2

1Department of General Surgery, Shanghai Jiao Tong University Affiliated Sixth People’s Hospital, Shanghai 200233, People’s Republic of China

2Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, People’s Republic of China

3School of Life Sciences, Shanghai University, Shanghai 200444, People’s Republic of China

4College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People’s Republic of China

5Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People’s Republic of China

*These authors have contributed equally to this work

Correspondence to:

Lan-Dian Hu, email: [email protected]

Yudong Cai, email: [email protected]

Xiangyin Kong, email: [email protected]

Keywords: cancer detection, liquid biopsy, RNA-seq data, support vector machine, maximum relevance minimum redundancy

Received: June 15, 2017; Accepted: August 16, 2017; Published: September 15, 2017

ABSTRACT

Detection and diagnosis of cancer are especially important for early prevention and effective treatment. Traditional methods of cancer detection are usually time-consuming and expensive. Liquid biopsy, a recently proposed noninvasive detection approach, can improve the accuracy and reduce the cost of detection by drawing on a personalized expression profile. However, few studies have analyzed this type of data, even though such analysis could lead to more effective methods for detecting different cancer subtypes. In this study, we applied established machine learning algorithms to data retrieved from patients who had one of six cancer subtypes (breast cancer, colorectal cancer, glioblastoma, hepatobiliary cancer, lung cancer and pancreatic cancer) as well as from healthy persons. Each sample was encoded by its quantitative gene expression profile. These profiles were analyzed with the maximum relevance minimum redundancy (mRMR) method, which produced two ranked lists of genes: the MaxRel feature list and the mRMR feature list. The incremental feature selection method was then applied to the mRMR feature list to extract the optimal feature subset, that is, the subset with which a support vector machine classifier performed best at distinguishing the cancer subtypes and healthy controls. Ten-fold cross-validation of the resulting optimal classification model yielded an overall accuracy of 0.751. In addition, we extracted the top eighteen features (genes) of the MaxRel feature list, including TTN, RHOH, RPS20 and TRBC2, and analyzed them in detail. The results indicated that these genes could be important biomarkers for discriminating different cancer subtypes and healthy controls.
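The difference between the MaxRel and mRMR feature lists mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes expression values have already been discretized, uses the common "relevance minus mean redundancy" form of the mRMR criterion with mutual information, and the function names (`mutual_information`, `maxrel_ranking`, `mrmr_ranking`) and the toy genes are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (count / n) * log2(count * n / (px[x] * py[y]))
        for (x, y), count in pxy.items()
    )


def maxrel_ranking(features, labels):
    """Rank features by relevance I(feature; label) only, highest first."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(relevance, key=lambda name: (-relevance[name], name))


def mrmr_ranking(features, labels):
    """Greedy mRMR: repeatedly pick the feature with the largest
    relevance minus mean mutual information with already-selected features."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    redundancy_sum = {name: 0.0 for name in features}
    remaining = sorted(features)  # sorted so that ties resolve deterministically
    selected = []
    while remaining:
        def score(name):
            mean_redundancy = redundancy_sum[name] / len(selected) if selected else 0.0
            return relevance[name] - mean_redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
        # Update each candidate's accumulated redundancy with the new pick.
        for name in remaining:
            redundancy_sum[name] += mutual_information(features[name], features[best])
    return selected


# Toy, made-up data: 8 samples, binary labels, three discretized "genes".
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 1],  # informative
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate of gene_a (redundant)
    "gene_c": [0, 1, 0, 0, 1, 1, 0, 1],  # weaker, but mostly non-redundant
}

if __name__ == "__main__":
    print("MaxRel:", maxrel_ranking(toy_features, toy_labels))
    print("mRMR:  ", mrmr_ranking(toy_features, toy_labels))
```

On this toy input, MaxRel places the duplicate `gene_b` directly after `gene_a`, whereas mRMR promotes the less redundant `gene_c` ahead of it. An incremental feature selection step would then train a classifier on successively longer prefixes of the mRMR list and keep the prefix with the best cross-validated accuracy.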
