Serum microRNA signatures and metabolomics have high diagnostic value in hepatocellular carcinoma

Background: Many new diagnostic biomarkers have been developed for hepatocellular carcinoma (HCC). We selected two methods with high diagnostic value, the detection of serum microRNAs and metabolomics based on gas chromatography/mass spectrometry (GC/MS), and attempted to establish appropriate diagnostic models. Methods: We reviewed the diagnostic efficiencies of all microRNAs identified in previous diagnostic studies. We then chose appropriate microRNAs, validated their diagnostic efficiencies and determined the optimal combination. We included 66 patients with HCC and 82 healthy controls (HCs) and measured the expression of the microRNAs. GC/MS analysis was performed, and three multivariate statistical methods were used to establish diagnostic models. The concentration of alpha-fetoprotein (AFP) was determined for comparison with the novel models. Results: A total of 82 published studies and 92 microRNAs were ultimately included in the systematic review. Seven microRNAs were selected for further validation of their diagnostic efficiencies. Among these, miR-21, miR-106b, miR-125b, miR-182 and miR-224 were expressed at significantly different levels in HCC patients. The combination of miR-21, miR-106b and miR-224 had the highest area under the curve (AUC) at 0.950, with a sensitivity of 80.3% and a specificity of 92.7%. The GC/MS analysis exhibited excellent diagnostic value, with an AUC reaching 1.0. In comparison, the AUC of the traditional biomarker, AFP, was 0.755. Conclusion: MicroRNAs and metabolomics show promising potential as new diagnostic methods because of their high diagnostic value compared with traditional biomarkers.


INTRODUCTION
Hepatocellular carcinoma (HCC) has the sixth highest cancer morbidity and the second highest mortality rate worldwide. The annual ratio of deaths to new cases is 0.95 for liver cancer, whereas that for colorectal cancer, which has a better prognosis, is 0.51 [1]. Currently, the diagnosis of HCC relies on biopsy, imaging reports (B-mode ultrasound, CT or MRI) and alpha-fetoprotein (AFP), according to the American Association for the Study of Liver Diseases (AASLD) Practice Guidelines. However, the sensitivity and specificity of AFP are barely satisfactory [2], necessitating the discovery of circulating biomarkers with a higher diagnostic value. After screening a host of novel biomarkers, including DNAs, RNAs, proteins and low-molecular-weight metabolites [2,3], we selected two methodologies, the detection of serum microRNAs and metabolomics based on gas chromatography/mass spectrometry (GC/MS), validated their diagnostic value and established appropriate models.
MicroRNAs are small, endogenous, non-coding RNAs that can regulate the expression of genes at the post-transcriptional level [4]. MicroRNAs can be released into peripheral blood when liver cell damage occurs [5]. During the past ten years, dozens of studies have shown that diverse microRNAs possess great potential for the diagnosis of HCC. Therefore, it is essential to summarize the diagnostic efficiencies of these microRNAs via a systematic review. Unfortunately, the published systematic reviews and meta-analyses have deficiencies. Some of these studies reviewed only one microRNA [6][7][8][9], while others pooled all diagnostic tests in a single meta-analysis but did not report information on each individual microRNA [10][11][12][13]. We tried to overcome these disadvantages by selecting seven microRNAs with high Youden indexes and high area under the curve (AUC) values of the receiver operating characteristic (ROC) curve to develop a diagnostic panel.
Metabolomics is defined as the quantitative measurement of all small-molecule metabolites in an organism at a specified time under specific environmental conditions [14]. Rapid development in metabolomics has made it a promising technology for disease diagnosis and biomarker discovery [15]. Compared with other metabolomic techniques, such as nuclear magnetic resonance (NMR) and liquid chromatography/mass spectrometry (LC/MS), GC/MS yields more robust results and is widely used in metabolite identification because of its high sensitivity, peak resolution and reproducibility [16]. Several studies have reported the diagnostic value of metabolomics in HCC [17]. We further validated the diagnostic accuracy of GC/MS analysis and compared the most frequently used statistical methods.

RESULTS
Study selection and literature characteristics
The initial search of the databases and other sources returned a total of 590 articles, of which 226 were from PubMed, 271 were from Embase and 93 were from the Chinese Biomedical Literature Database (CBM). After removing 131 duplicates, 372 irrelevant studies and five articles that failed to provide enough diagnostic information, 82 published studies were enrolled in this systematic review (Supplementary Table 1). A total of 6035 HCC patients and 8181 healthy controls (HCs) were included. The characteristics of the 82 studies are displayed in Supplementary Table 2.
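The screening counts in this paragraph can be checked with simple arithmetic. The sketch below only restates the numbers reported above; the variable names are illustrative and are not taken from the study.

```python
# Records retrieved per database, as reported in the text.
retrieved = {"PubMed": 226, "Embase": 271, "CBM": 93}
total_retrieved = sum(retrieved.values())

# Records removed at each screening step, as reported in the text.
removed = {
    "duplicates": 131,
    "irrelevant_studies": 372,
    "insufficient_diagnostic_information": 5,
}
included_studies = total_retrieved - sum(removed.values())

print(total_retrieved, included_studies)
```

The totals agree with the 590 retrieved articles and the 82 included studies stated in the text.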

Diagnostic value of microRNAs in the literature
A total of 92 microRNAs were mentioned in the included articles, of which 65 were studied in a single article. We conducted meta-analyses to assess the diagnostic accuracy of the other 27 microRNAs. The details of their corresponding diagnostic values are shown in Table 1.

Publication bias
A Deeks' funnel plot was used to evaluate publication bias (Figure 1). The P value of Deeks' test was 0.08, which indicated that no significant publication bias was present in this analysis.

Study population
The clinical and pathological characteristics of the study participants are presented in Table 2. The age and gender ratio were significantly different between HCC patients and HCs; thus, an analysis of covariance was conducted. The results suggested that age and gender ratio were unrelated to the expression of the microRNAs, the scores of the components and the concentration of AFP.

Expression of microRNAs
MiR-21, miR-106b, miR-125b, miR-130b, miR-182, miR-224 and miR-338 were selected through the systematic review. The results of the quantitative reverse-transcription polymerase chain reaction (qRT-PCR) indicated that the serum levels of miR-21, miR-106b and miR-125b in the HCC patients were significantly higher than those in HCs, while those of miR-182 and miR-224 were significantly lower. For miR-130b and miR-338, no significant difference was observed between HCC patients and HCs (Supplementary Table 3 and Figure 2). None of the seven microRNAs showed a significant difference in expression among the four TNM stages (Kruskal-Wallis test, P > 0.05).

Diagnostic models established using microRNAs
Table 3 presents the cut-off value, sensitivity, specificity, Youden index and AUC of each microRNA and their combinations. The combination of miR-21, miR-106b and miR-224 had the highest AUC value at 0.950, with a sensitivity of 80.3% and a specificity of 92.7%. The cut-off value of the model was -8.99, according to the formula miR-21 × 2.271 + miR-106b × 1.647 + miR-224 × (-3.306).
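As a minimal sketch, the reported formula and cut-off can be written as a scoring function. The coefficients, the cut-off, the sensitivity and the specificity are those stated above. The direction of the decision rule (a score above the cut-off read as HCC) is an assumption inferred from the signs of the coefficients, and the sample values are hypothetical.

```python
# Coefficients and cut-off as reported in the text for the three-microRNA panel.
COEFFICIENTS = {"miR-21": 2.271, "miR-106b": 1.647, "miR-224": -3.306}
CUT_OFF = -8.99


def panel_score(expression):
    """Weighted sum of the relative expression values of the three microRNAs."""
    return sum(weight * expression[name] for name, weight in COEFFICIENTS.items())


def is_positive(expression, cut_off=CUT_OFF):
    """Assumption: a score above the cut-off is read as HCC-positive."""
    return panel_score(expression) > cut_off


def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Hypothetical relative expression values, invented for illustration only.
sample = {"miR-21": -2.0, "miR-106b": -3.0, "miR-224": -1.5}
print(round(panel_score(sample), 3), is_positive(sample))

# Youden index implied by the reported sensitivity and specificity.
print(round(youden_index(0.803, 0.927), 3))
```

With the reported sensitivity of 80.3% and specificity of 92.7%, the Youden index of the panel works out to 0.730.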

Discrepant metabolites and total ion chromatogram
A total of 1118 features were extracted in this experiment. Seventeen significantly different metabolites are presented in Supplementary Table 4. The retention time (RT) in the total ion chromatograms was stable, with no drift in any of the peaks, which indicated that the results were reliable.
Table note: The expression trend indicates upregulated or downregulated expression in the HCC patients versus the control group. The data on sensitivity, specificity and AUC were obtained via meta-analysis when more than one article was included.

Diagnostic models established using metabolomics
First, we performed the multivariate statistical analyses on all 1118 metabolites. In the principal component analysis (PCA) model, we extracted ten principal components, seven of which had eigenvalues greater than 1.0. We calculated the diagnostic parameters for models fitted with one to ten principal components (Table 4). The AUC increased with the number of principal components included in the model. We extracted one component each in the partial least squares-discriminant analysis (PLS-DA) and orthogonal partial least squares-discriminant analysis (OPLS-DA) models, and the AUCs reached 0.89 and 1.0, respectively.
When the seventeen significantly different metabolites were used to diagnose HCC, the AUC reached 1.0. Further multivariate statistical analyses also displayed promising results. In the PCA model, we extracted five principal components, three of which had eigenvalues greater than 1.0. The AUC reached 1.0 when more than four principal components were included. Only one component was extracted in each of the PLS-DA and OPLS-DA models, and both AUCs reached 0.996.
More diagnostic information regarding the multivariate statistical analyses is shown in Table 4 and Figure 3.
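The eigenvalue threshold of 1.0 mentioned above can be applied with a one-line filter. The following is a minimal sketch; the eigenvalues are hypothetical placeholders chosen so that seven of ten exceed 1.0, and they are not the values obtained in the study.

```python
def components_with_large_eigenvalues(eigenvalues, threshold=1.0):
    """Return 1-based indices of components whose eigenvalue exceeds the threshold."""
    return [i for i, value in enumerate(eigenvalues, start=1) if value > threshold]


# Hypothetical eigenvalues for ten extracted components (not the study's values).
example_eigenvalues = [5.2, 3.1, 2.4, 1.9, 1.5, 1.2, 1.05, 0.9, 0.7, 0.4]
retained = components_with_large_eigenvalues(example_eigenvalues)
print(len(retained))
```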

Diagnostic value of traditional tumor biomarkers
The AFP concentration was significantly different between HCC patients and HCs (Mann-Whitney U-test, P < 0.001). The median concentrations in the patients and HCs were 42.2 (range, 1.2 to > 60500) and 3.6 (range, 0.9-10.3) μg/L, respectively. The AUC of AFP was 0.755 (95% CI, 0.666-0.843; sensitivity = 59.1%, specificity = 100.0%) when the cut-off value was 12.3 μg/L. When the cut-off value was 20 μg/L, which is the upper bound for 95% of healthy individuals, the sensitivity was 54.5%, and the specificity was still 100.0%.
The ROC curves of AFP, metabolomics and the combination of microRNAs are displayed in Figure 4.
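For readers who wish to reproduce this kind of comparison, the AUC of a single marker can be computed directly from its definition as the probability that a randomly chosen patient has a higher value than a randomly chosen control. The sketch below is illustrative only: the concentrations are invented, and the rule that values above the cut-off are positive is an assumption.

```python
def auc_from_values(cases, controls):
    """AUC as P(case > control) + 0.5 * P(case == control) over all pairs."""
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(cases, controls, cut_off):
    """Assumption: a value above the cut-off counts as a positive test."""
    sensitivity = sum(value > cut_off for value in cases) / len(cases)
    specificity = sum(value <= cut_off for value in controls) / len(controls)
    return sensitivity, specificity


# Hypothetical marker concentrations (μg/L), invented for illustration only.
example_cases = [2.0, 15.0, 40.0, 300.0]
example_controls = [1.0, 3.0, 5.0, 9.0]

example_auc = auc_from_values(example_cases, example_controls)
example_sens, example_spec = sensitivity_specificity(example_cases, example_controls, 12.3)
print(example_auc, example_sens, example_spec)
```

This pairwise definition is equivalent to the Mann-Whitney U statistic divided by the product of the two group sizes.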

DISCUSSION
It is well established that early diagnosis and treatment of HCC can improve patient survival. Thus, the search for new biomarkers is gaining momentum. Most novel diagnostic biomarkers are gene mutations, single nucleotide polymorphisms (SNPs), epigenetic markers, mRNAs, non-coding RNAs or proteins, including GPC3, GP73 and DKK1 [18,19]. Screening via proteomics or metabolomics is also a feasible way to discover new biomarkers. After investigating the diagnostic efficiencies and limitations of biomarkers, we selected serum microRNAs and GC/MS to validate their diagnostic value and establish appropriate models. Among the thousands of microRNAs that have been discovered, many have been tested for their diagnostic value in HCC [10]. A common research approach is to screen candidates with a microRNA microarray in a small sample and then validate the results via qRT-PCR in a larger sample. We reviewed the diagnostic value of each microRNA. The meta-analyses increased the statistical power by expanding the number of included articles and the sample sizes.
Based on the results of the systematic review, we selected seven microRNAs with high AUC values or Youden indexes that had been reported in multiple articles. MiR-21, miR-106b, miR-125b, miR-182 and miR-224 were expressed at significantly different levels in HCC patients versus HCs. With AUCs higher than 0.7, miR-21 and miR-224 have the potential to become independent diagnostic biomarkers of HCC. Combining microRNAs further raised the diagnostic value: the combination of miR-21, miR-106b and miR-224 reached an AUC of 0.950. With miR-21, miR-106b, miR-125b, miR-182 and miR-224 combined, the AUC was 0.952. However, there was no significant difference between these two combinations.
As circulating diagnostic biomarkers, microRNAs have advantages and disadvantages. Unlike mRNAs, microRNAs are stable at room temperature and remain so after repeated freeze-thawing [20]. In addition, compared with liver puncture, blood examination is non-invasive. Nevertheless, the choice of internal/external reference RNA, the dosage of reagents and the operating process lack standardization; therefore, the cut-off value cannot be unified, and for some microRNAs even the direction of the expression change differs between studies. In addition, the etiology, such as hepatitis B virus or hepatitis C virus infection, may affect the expression of microRNAs.

Figure 3: Score plots of the GC/MS analysis in the hepatocellular carcinoma patients and healthy controls. ○ represents the hepatocellular carcinoma group. ▲ represents the healthy control group. The scatter plots of the principal component analysis (PCA) with two principal components for all metabolites (1A) and significantly different metabolites (1B). The line within the plot represents the optimal cut-off line. The strip charts of the partial least squares-discriminant analysis (PLS-DA) with the only component for all metabolites (2A) and significantly different metabolites (2B). The strip charts of the orthogonal partial least squares-discriminant analysis (OPLS-DA) with the only component for all metabolites (3A) and significantly different metabolites (3B). Abbreviations: HCC, hepatocellular carcinoma; HC, healthy control; RT-PCR, reverse-transcription polymerase chain reaction; GC/MS, gas chromatography/mass spectrometry; PCA, principal component analysis; PLS-DA, partial least squares-discriminant analysis; OPLS-DA, orthogonal partial least squares-discriminant analysis; AFP, alpha-fetoprotein.
As expected, the diagnostic efficiency of metabolomics was satisfactory, whether all detected metabolites or only the significantly different metabolites were included. As shown in Table 4, when the PCA, PLS-DA and OPLS-DA models included the same number of components, the OPLS-DA model had the highest AUC and the PCA model the lowest. This result can be explained mathematically. PCA is an unsupervised method, whereas PLS-DA and OPLS-DA are supervised methods. Building on PLS, OPLS further separates the orthogonal variation by orthogonal signal correction and enlarges the differences between the two data matrices [21,22]. Although the diagnostic value of the PCA model was not superior to that of the PLS-DA and OPLS-DA models when the same number of components was included, PCA can extract more principal components to increase the AUC.
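The difference between unsupervised and supervised projection can be illustrated with a toy two-variable example. The sketch below is not the SIMCA-P procedure used in this study: it compares the first principal component with the first PLS weight vector (proportional to X^T y for a single response) on invented data in which the group difference lies along the low-variance variable. OPLS is not implemented, and all data and function names are hypothetical.

```python
import math


def centre(points):
    """Subtract the column means from a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return [(x - mean_x, y - mean_y) for x, y in points]


def first_pc_direction(points):
    """Leading eigenvector of the 2x2 scatter matrix (ignores the labels)."""
    a = sum(x * x for x, _ in points)
    b = sum(x * y for x, y in points)
    d = sum(y * y for _, y in points)
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)
    vx, vy = (lam - d, b) if a >= d else (b, lam - a)
    norm = math.hypot(vx, vy)  # zero only for perfectly isotropic data
    return vx / norm, vy / norm


def first_pls_direction(points, labels):
    """First PLS weight vector, proportional to X^T y (uses the labels)."""
    mean_label = sum(labels) / len(labels)
    wx = sum(x * (label - mean_label) for (x, _), label in zip(points, labels))
    wy = sum(y * (label - mean_label) for (_, y), label in zip(points, labels))
    norm = math.hypot(wx, wy)
    return wx / norm, wy / norm


def project(points, direction):
    """Scores of the points along a unit direction."""
    dx, dy = direction
    return [x * dx + y * dy for x, y in points]


def auc(scores, labels):
    """Pairwise AUC; the sign of a component is arbitrary, so take max(a, 1 - a)."""
    cases = [s for s, label in zip(scores, labels) if label == 1]
    controls = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum(1.0 if c > h else 0.5 if c == h else 0.0 for c in cases for h in controls)
    value = wins / (len(cases) * len(controls))
    return max(value, 1.0 - value)


# Invented data: large variance along x (unrelated to group), small shift along y.
points = centre([(-30, 5), (-10, 4), (10, 6), (30, 5),
                 (-30, -5), (-10, -6), (10, -4), (30, -5)])
labels = [1, 1, 1, 1, 0, 0, 0, 0]

pca_auc = auc(project(points, first_pc_direction(points)), labels)
pls_auc = auc(project(points, first_pls_direction(points, labels)), labels)
print(pca_auc, pls_auc)
```

In this constructed case the first principal component follows the high-variance variable and separates the groups poorly, whereas the supervised direction separates them completely. Adding the second principal component would recover the separation, which mirrors the observation that PCA can reach a high AUC by including more components.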
The advantages of serum GC/MS analysis are its high diagnostic value and non-invasive examination process. The statistical models established by PCA, PLS-DA and OPLS-DA are stable when the variables are numerous and the observations are few. Nevertheless, as with the detection of microRNA expression, the pretreatment process is not standardized, including the choice of the derivatization reagents and internal standard, the duration of each step and the operating order.
In terms of price, the new biomarker tests are more expensive than the traditional AFP test, which costs only 5.2 dollars in China. Testing one sample for three microRNAs and for the metabolic profile costs approximately 20 and 72.5 dollars, respectively. By comparison, contrast-enhanced abdominal CT and contrast-enhanced MRI are priced at around 100 and 135 dollars, respectively. A liver puncture costs 44 dollars, excluding test-related room and nursing care charges.
In conclusion, the diagnostic value of the new models is higher than that of the traditional biomarker, AFP. We suggest that GC/MS analysis and a combination of microRNAs be applied to the diagnosis of HCC, especially after the lesion has been located by imaging examination.

MATERIALS AND METHODS
Study design
First, an electronic search of the PubMed, Embase and CBM databases was performed to identify relevant articles published up to July 6, 2017. The search strategy was (miRNA OR microRNA OR miR) AND ("liver neoplasms"[Mesh] OR "hepatocellular carcinoma" OR "liver cancer") AND (blood OR serum OR plasma OR circulating) AND (diagnosis OR diagnostic OR diagnose). In addition, we examined the reference lists of the identified articles to include additional relevant studies. No language restrictions were imposed.
Second, we chose microRNAs that had high AUC values and were included in numerous studies to establish a diagnostic model. Serum specimens from 66 HCC patients and 82 HCs were collected, and the expression of the microRNAs was measured by qRT-PCR.
Next, we randomly selected 24 patients and 30 HCs from the cohort mentioned above and profiled their metabolomic signatures via GC/MS analysis.
Finally, we measured the serum concentration of the traditional tumor biomarker, AFP. Its diagnostic efficiency was calculated and compared with that of the new models. The flow diagram for the study is shown in Figure 5.

Inclusion and exclusion criteria of the literature
The inclusion criteria for the systematic review were as follows: (1) studies of microRNAs that compared HCC patients with HCs; (2) studies that employed blood specimens, including serum and plasma; and (3) studies that used qRT-PCR techniques. The exclusion criteria were as follows: (1) failure to provide sufficient diagnostic information; (2) duplicate data from the same authors; and (3) cell or animal studies, reviews and letters.

Data extraction
Two reviewers were independently responsible for study selection and data extraction. The following data were retrieved from all included studies: (1) basic characteristics of the studies, including the first author, year of publication, country, ethnicity, sample size, mean age, gender, type of specimens, target microRNAs, and reference control; and (2) diagnostic parameters of the microRNAs, including expression variation, sensitivity, specificity, and AUC.

Patients and specimens
In this study, we included patients and HCs from Zhongshan Hospital, Fudan University between May 2015 and July 2015. The HCC patients were all definitively diagnosed in accordance with the AASLD Practice Guidelines. Patients were excluded if they had a history of other malignant tumors or had received surgery, interventional therapy, radiotherapy or chemotherapy. Healthy individuals were identified by clinical manifestations, medical history and normal liver function. The serum samples were centrifuged for 10 min at 820 g and 4°C to remove cell debris, and the supernatants were immediately stored at −80°C until analysis. The concentration of serum AFP was measured via an electro-chemiluminescence immunoassay. The protocol was approved by the Ethics Committee of Zhongshan Hospital of Fudan University, Shanghai. All participants provided written informed consent.

RNA extraction and reverse transcription
As an external reference, 2 μl of 25 fmol cel-miR-39 (Tiangen, Beijing, China) was added to 200 μl of each serum sample. Total RNA was isolated simultaneously using the miRcute microRNA Isolation Kit (Tiangen, Beijing, China) according to the manufacturer's protocol [23]. The optical density of the extracted total RNA was determined at 260 and 280 nm on a NanoDrop spectrophotometer (NanoDrop, Wilmington, DE, USA) to assess its concentration and purity.
The extracted microRNA was polyadenylated with poly(A) polymerase in a 20-μl volume, and 6 μl of the poly(A) reaction solution was reverse transcribed to cDNA in another 20-μl volume with the miRcute microRNA First-strand cDNA Synthesis Kit (Tiangen, Beijing, China) according to the manufacturer's protocol. All procedures were carried out in triplicate to remove outliers.

Quantitative real-time PCR
The qPCR reaction was conducted with the miRcute microRNA qPCR Detection Kit (Tiangen, Beijing, China) on an ABI PRISM 7500 Sequence Detection System (Applied Biosystems, Foster City, CA, USA). Each 20-μl qPCR reaction solution contained cDNA, 2× miRcute microRNA premix (with SYBR and ROX), the manufacturer-provided universal reverse primer, and a microRNA-specific forward primer (Tiangen, Beijing, China). The real-time PCR cycling conditions were as follows: 94°C for 2 min, followed by 45 cycles of 94°C for 20 s, annealing at 60°C for 34 s, and extension at 72°C for 30 s. At the end of the real-time PCR reaction, a melting curve analysis was performed to ensure specific amplification of the expected PCR product.
The relative expression of the microRNAs was calculated as $\log_{10}\left(2^{-\Delta C_T}\right)$, with cel-miR-39 as the reference. The $\Delta C_T$ was calculated by subtracting the $C_T$ value of cel-miR-39 from that of the microRNA of interest [23].
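The calculation described above can be sketched in a few lines. The function name and the example threshold cycle values are hypothetical; only the formula itself comes from the text.

```python
import math


def relative_expression(ct_target, ct_reference):
    """log10(2^-dCt), where dCt = Ct(target microRNA) - Ct(cel-miR-39)."""
    delta_ct = ct_target - ct_reference
    return math.log10(2.0 ** (-delta_ct))


# Hypothetical threshold cycle values, invented for illustration only.
example_value = relative_expression(ct_target=28.0, ct_reference=25.0)
print(round(example_value, 4))
```

Because log10(2^-ΔCt) equals -ΔCt × log10(2), the relative expression is a linear function of ΔCt: each additional cycle of delay relative to the reference lowers the value by about 0.301.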

Specimen processing for metabolomics
A 200-μl aliquot of each serum sample was transferred into a glass centrifuge tube for GC/MS analysis. As an internal standard, 200 μl of 2-chlorophenylalanine (0.3 g/L) was added, followed by 600 μl of methanol. The mixture was vortexed for 30 s and then incubated at -20°C for 10 min. The samples were then centrifuged at 12000 × g and 4°C for 15 min. From each sample, 800 μl of the supernatant was collected into an ampoule bottle and evaporated to dryness under a stream of nitrogen gas at 50°C for approximately 30 min. Subsequently, 200 μl of a methoxyamine pyridine solution (15 g/L) was added to the ampoule bottle. The mixture was vortexed for 2 min and incubated at 37°C for 1 hour. Then, 200 μl of bis-(trimethylsilyl)-trifluoroacetamide (BSTFA) plus 1% trimethylchlorosilane (TMCS) was added, and the mixture was again vortexed for 2 min and incubated at 100°C for 30 min. The methanol, 2-chlorophenylalanine, methoxyamine and pyridine were obtained from Aladdin (Shanghai, China). BSTFA with 1% TMCS was purchased from Sigma-Aldrich (St. Louis, MO, USA). Each sample was processed in duplicate.

GC/MS analysis
The GC/MS analysis was performed on an Agilent 6980 GC system equipped with a fused-silica capillary column (30 m × 0.25 mm internal diameter) with a 0.25-μm HP-5MS stationary phase (Agilent, Shanghai, China). We used the same operational methods as in our previous studies [24].

Statistical analysis
Meta-analyses were used to assess the accuracy of individual microRNAs for HCC diagnosis, based on their sensitivity, specificity and the AUC of the summary receiver operating characteristic (SROC) curve. Deeks' funnel plot was used to evaluate publication bias.
A power analysis was used to calculate the number of cases and HCs required in the microRNA validation phase. A Mann-Whitney U-test was used to compare the expression of microRNAs and the concentration of AFP between HCC patients and HCs. A Kruskal-Wallis test was used to assess the relationship between the expression of microRNAs and TNM stage. The diagnostic efficiencies of the microRNAs were determined by assessing the sensitivity, specificity and AUC. A stepwise logistic regression was used to select microRNAs for inclusion in the diagnostic model.
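The text does not state how the cut-off values were chosen. A common approach, assumed here for illustration, is to take the threshold that maximises the Youden index. The sketch below scans every observed value as a candidate threshold; the data are invented, and a value at or above the threshold is assumed to count as positive.

```python
def best_cut_off(cases, controls):
    """Return (cut_off, sensitivity, specificity, youden) maximising the Youden index.

    Assumption: a value greater than or equal to the cut-off is a positive test.
    """
    best = None
    for cut_off in sorted(set(cases) | set(controls)):
        sensitivity = sum(value >= cut_off for value in cases) / len(cases)
        specificity = sum(value < cut_off for value in controls) / len(controls)
        youden = sensitivity + specificity - 1.0
        if best is None or youden > best[3]:
            best = (cut_off, sensitivity, specificity, youden)
    return best


# Hypothetical marker values, invented for illustration only.
example_cases = [1.2, 1.4, 1.8, 2.5, 3.1]
example_controls = [0.2, 0.5, 0.8, 1.0, 1.5]

cut_off, sensitivity, specificity, youden = best_cut_off(example_cases, example_controls)
print(cut_off, sensitivity, specificity, round(youden, 3))
```

For a marker that is lower in patients than in controls, such as miR-182 or miR-224 in this study, the same function can be applied after negating the values.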
The metabolomic data were normalized with the "XCMS" package in R software and then stored in a two-dimensional matrix, including the RT, mass-to-charge ratio (MZ) and peak intensity. The metabolites were identified from the National Institute of Standards and Technology (NIST) mass spectra library through RT and MZ [24]. Significantly different metabolites were screened via the variable importance in the projection (VIP) value of the OPLS-DA model (> 1) and the P value of the t-test (≤ 0.001). Multivariate statistical analyses, including PCA, PLS-DA and OPLS-DA, were carried out via SIMCA-P on all metabolites and on the significantly different metabolites, respectively. When more than one component was extracted, a logistic regression was used to identify the best diagnostic model among combinations of the components.
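The two screening thresholds described above (VIP > 1 and P ≤ 0.001) amount to a simple filter once the VIP values and P values have been exported from the modelling software. The sketch below assumes such an export in the form of a list of records; the record layout, feature names and values are hypothetical.

```python
def screen_features(features, vip_threshold=1.0, p_threshold=0.001):
    """Keep features with VIP > vip_threshold and t-test P <= p_threshold."""
    return [
        feature["name"]
        for feature in features
        if feature["vip"] > vip_threshold and feature["p_value"] <= p_threshold
    ]


# Hypothetical feature table; names and values are invented for illustration.
example_features = [
    {"name": "feature_A", "vip": 1.8, "p_value": 0.0002},
    {"name": "feature_B", "vip": 0.7, "p_value": 0.0001},
    {"name": "feature_C", "vip": 1.3, "p_value": 0.0400},
    {"name": "feature_D", "vip": 1.1, "p_value": 0.0010},
]
selected = screen_features(example_features)
print(selected)
```

Both conditions must hold, so a feature with a high VIP but a P value above 0.001, or the reverse, is excluded.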

Figure 1: Deeks' funnel plot for the assessment of publication bias.

Figure 2: Box plots for the expression of the seven microRNAs. The P values of miR-21, miR-106b, miR-125b, miR-130b, miR-182, miR-224 and miR-338 were < 0.001, 0.008, < 0.001, 0.224, 0.028, < 0.001 and 0.070, respectively. The lines within the boxes represent the median values, and the edges of the boxes indicate the interquartile ranges. The lines outside the boxes indicate the 95% ranges. The points outside the boxes represent the values beyond the 95% ranges.

Figure 4: Receiver operating characteristic (ROC) curves. ROC curves of the combination of miR-21, miR-106b and miR-224, the GC/MS analysis with three statistical methods for all metabolites, and AFP for discriminating hepatocellular carcinoma patients from control subjects. The curve for the PCA model was generated with two principal components included.

Table 4: Diagnostic value of the gas chromatography/mass spectrometry analysis with multivariate statistical analysis methods
Column headers: Source of components; Statistical method
Abbreviations: AUC, area under the curve; CI, confidence interval; PCA, principal component analysis; PLS-DA, partial least squares-discriminant analysis; OPLS-DA, orthogonal partial least squares-discriminant analysis.