The anticipating value of PLK1 for diagnosis, progress and prognosis and its prospective mechanism in gastric cancer: a comprehensive investigation based on high-throughput data and immunohistochemical validation

Polo-like kinase 1 (PLK1) is a multi-functional protein and its aberrant expression is a driver of cancerous transformation and progression. To increase our understanding of the clinical value and potential molecular mechanism of PLK1 in gastric cancer (GC), we performed this comprehensive investigation. A total of 25 datasets and 12 publications were finally incorporated. Additional immunohistochemistry was conducted to validate the expression pattern of PLK1 in GC. The pooled standardized mean difference (SMD) indicated that PLK1 mRNA was up-regulated in GC (SMD=1.21, 95% CI: 0.65-1.77, P<0.001). Similarly, the pooled odds ratio (OR) revealed that PLK1 protein was overexpressed in GC compared with normal gastric tissue (OR=12.12, 95% CI: 5.41-27.16, P<0.001). The area under the curve (AUC) of the summary receiver operating characteristic (SROC) curve was 0.86. Furthermore, PLK1 overexpression in GC patients was significantly associated with unfavorable overall survival (HR=1.54, 95% CI: 1.30-1.83, P<0.001), lymph node metastasis (OR=1.78, 95% CI: 1.13-2.80, P=0.013) and advanced TNM stage (OR=1.48, 95% CI: 1.02-2.15, P=0.038). Altogether, 100 genes with expression patterns similar to PLK1 were identified by Gene Expression Profiling Interactive Analysis (GEPIA) and subjected to functional enrichment analysis. These genes were related to gene ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways relevant to the cell cycle. Gene set enrichment analysis (GSEA) indicated that PLK1 is associated with various cancer-related pathways. Collectively, this study suggests that PLK1 overexpression could play vital roles in the carcinogenesis and deterioration of GC via regulating tumor-related pathways.


INTRODUCTION
Gastric cancer (GC) is the fourth leading human cancer, and it ranks as the second most common cause of tumor-related mortality all over the world, seriously threatening human health [1]. It is estimated that approximately 28,000 newly diagnosed cases and 10,960 deaths will occur in the United States in 2017 [2]. Although diagnostic and therapeutic techniques of GC have made advances over the past decades, the mortality rate is still fairly high due to its aggressive behavior [3][4][5][6]. Thus, a number of researchers have focused on probing several molecular biomarkers to modify clinical management strategies and better understand the molecular mechanism of GC [7][8][9]. Because clinically applicable biomarkers are fairly meager, exploring novel, effective molecular biomarkers to elucidate effective therapeutic targets for GC patients is still imperative.
Polo-like kinase 1 (PLK1), also known as serine/threonine-protein kinase 13 (STPK13), is a multi-faceted regulator of the cell cycle [10,11]. Owing to its broad biological functionality, PLK1 is widely regarded as an oncogene and has been implicated in a broad range of malignant human tumors, including breast [12], liver [13], and colorectal cancers [14]. Increasing evidence also suggests that dysregulated expression of PLK1 plays an important role in GC progression. For example, Otsu H reported that GC patients with high expression of PLK1 and DNA aneuploidy had inferior survival outcomes [15]. Elevated PLK1 also promotes GC cell metastasis and epithelial-mesenchymal transition by regulating the activation of the protein kinase B pathway [16]. Although several independent studies have provided valuable perspectives on PLK1 in GC, the small number of studies limits their ability to capture the complexity of GC, and no meta-analysis has been performed to clarify the reliability and extent of its clinical value in GC.
Here, we collected all available datasets presenting the PLK1 expression pattern and its clinicopathological significance in GC from the Gene Expression Omnibus (GEO), Oncomine, The Cancer Genome Atlas (TCGA) and the literature. In parallel, immunohistochemistry (IHC) staining was performed on clinical specimens from in-house GC patients to validate the expression pattern of PLK1. With this information gathered, we conducted a meta-analysis to draw a comprehensive conclusion. Potential regulatory mechanisms of PLK1 were then analyzed using bioinformatics methods. Through these means, we unveil a key role for PLK1 as a GC promoter.

RESULTS
The present study comprised several sequential procedures (Figure 1). After screening and inspection, a total of 25 datasets and 12 publications [15,17-27] were incorporated into the present study. Among them, 19 datasets and 2 articles that reported PLK1 mRNA expression values in GC and control groups were included (Table 1). Additionally, a total of 10 datasets and 11 publications that were used for investigating the prognostic value and clinicopathological significance of PLK1 were included and summarized in Table 2.

The expression level of PLK1 in GC via various databases
First, by mining online databases (TCGA, GEO, and Oncomine), 19 datasets were obtained that provided PLK1 mRNA expression values in GC tissues and adjacent non-tumor tissues. We investigated the expression pattern of PLK1 in gastric cancer in each independent dataset. In Figures 2-5, the PLK1 expression pattern of each dataset is displayed as scatter plots and receiver operating characteristic (ROC) curves. Then, we detected the expression of PLK1 protein in 43 pairs of GC tissues by IHC staining. The immunoreactivity score (IRS) indicated that PLK1 protein expression was significantly higher in the 43 GC samples (9.42±3.06) than in adjacent non-tumor tissues (7.02±3.17, P=0.001, Figure 6).

Verification of PLK1 up-regulation in GC via a meta-analysis
As some individual studies were too small to yield a valid conclusion, we integrated all of the data. A random-effects model was selected, as apparent heterogeneity existed among the 21 studies listed in Table 1 (I² = 96.1%, P<0.001; Figure 7). The pooled standardized mean difference (SMD) of PLK1 mRNA was 1.21 (95% CI: 0.65-1.77, P<0.001; Figure 7), which suggested that PLK1 mRNA was remarkably up-regulated in GC. Additionally, the pooled odds ratio (OR) calculated from 7 articles and our IHC staining results revealed that PLK1 protein expression was also up-regulated in GC compared to normal gastric tissue (OR=12.12, 95% CI: 5.41-27.16, P<0.001; Figure 8); a random-effects model was used because of heterogeneity (I² = 67%, P=0.002).
Sensitivity analysis suggested that the pooled SMD was stable (Figure 9A). Furthermore, Begg's funnel plot and Egger's test were carried out to assess publication bias. Begg's regression plot showed no potential publication bias (P=0.139, Figure 9B). Likewise, Egger's test indicated no publication bias for PLK1 overexpression in GC (P=0.134). In summary, these results confirmed the overexpression of PLK1 in GC.
To further assess the capability of PLK1 to discriminate cancerous from non-cancerous gastric tissues, we generated a summary receiver operating characteristic (SROC) curve and then calculated the area under the curve (AUC). A total of 28 groups of data obtained from TCGA, Oncomine, GEO, publications and IHC staining were summarized. The overall AUC of PLK1 in GC was 0.86 (95% CI: 0.82-0.88), with a sensitivity and specificity of 0.82 (95% CI: 0.72-0.89) and 0.75 (95% CI: 0.62-0.84), respectively (Figure 10).

PLK1 overexpression and clinicopathological features in GC patients
For clinical parameters, we performed integrative analyses and revealed that PLK1 overexpression in GC patients was associated with several clinicopathological

Evaluation of the prognostic value of PLK1 in GC
After screening, ten studies assessed the correlation between overexpressed PLK1 and overall survival (OS), with three studies containing information regarding the association between up-regulated PLK1 and recurrence-free survival (RFS). Additionally, one study had sufficient information to calculate the relationship between high PLK1 expression and progression-free survival (PFS). PLK1 overexpression was markedly correlated with worse OS (hazard ratio (HR): 1.51, 95% CI: 1.14-1.99, P=0.004; Figure 11A), with significant heterogeneity (I² = 59.3%, P=0.009), under the random-effects model. Furthermore, a sensitivity analysis was performed, and the result implied that the pooled HR was stable (Figure 12A). As shown in Figure 12B, Begg's regression plot revealed no evidence of potential publication bias (P=0.641). Additionally, Egger's test indicated no statistically significant publication bias for the association between positive PLK1 expression and OS (P=0.721). To minimize heterogeneity, we omitted the article by Jang YJ; PLK1 overexpression remained significantly correlated with shorter OS (HR=1.54, 95% CI: 1.30-1.83, P<0.001; Figure 11B), with no significant heterogeneity (I² = 36.2%, P=0.129). PLK1 overexpression was also associated with inferior PFS (HR: 1.47, 95% CI: 1.01-2.14). However, this result has low credibility, as only one study provided data. Additionally, RFS did not differ significantly between the high and low PLK1 expression groups (HR: 1.05, 95% CI: 0.67-1.66, P=0.834).

Relevant genes of PLK1 and gene-annotation enrichment analysis
The 100 genes most relevant to PLK1 in GC were obtained using Gene Expression Profiling Interactive Analysis (GEPIA) as described. Determining the biological value of these genes could provide important information on the mechanism of PLK1 in GC. These 100 relevant genes were input into DAVID to perform GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. As shown in Figure 13, GO analyses were implemented in three categories, including biological process (BP), cellular component (CC) and molecular function (MF). For BP, the most notably enriched functional terms were cell cycle, M phase, and cell cycle phase (P<0.001). Regarding CC, genes markedly assembled at condensed chromosome, chromosome and microtubule cytoskeleton (P<0.001). On the basis of MF, genes prominently accumulated in ribonucleotide binding, purine ribonucleotide binding and purine nucleotide binding (P<0.001). Additionally, these genes in the KEGG enrichment analysis were shown to be particularly related to the cell cycle (P<0.001, Figure 14).

Protein-protein interactions (PPI) network construction and the hub genes identification
To gain further insight into the molecular mechanism of PLK1, we used the functional protein network database STRING to construct the PPI network. From the constructed PPI network, nine hub genes of PLK1 were obtained (Figure 15A). These genes were MAD2L1, CHEK1, CDC45, CDC20, CCNB2, CCNB1, CCNA2, BUB1B and BUB1. We also provide expression matrix plots of these hub genes (Figure 15B). Furthermore, the correlations between PLK1 and the hub genes were calculated (Figure 15C-15K). All nine genes were positively correlated with PLK1.

Identification of PLK1-mediated molecular functions in GC by GSEA
To explore the molecular functions of PLK1 in GC in depth, we performed Gene Set Enrichment Analysis (GSEA) using TCGA data. Among all of the predefined hallmark gene sets, a total of 9 items were most strongly associated with PLK1 overexpression in the TCGA cohort (FDR q-value<0.01), including MTORC1 signaling, E2F targets and the G2M checkpoint (Figure 16), suggesting that PLK1 may be deeply engaged in GC tumorigenesis and progression through several cancer-associated signaling pathways.

DISCUSSION
Identification of risk factors and accurate prognostic prediction in GC patients are critical to selecting appropriate therapies and guiding clinical follow-up [28][29][30]. Unfortunately, accurately identifying the risks in individual patients is relatively difficult [31][32][33]. Although considerable progress has been made in understanding the clinical value of PLK1 in GC, the conclusions have been controversial. Therefore, a comprehensive view of PLK1 in GC is needed. Taking advantage of the vast publicly available databases, we sought to clarify the function of PLK1 in GC by mining the Oncomine, TCGA, and GEO databases, published literature and IHC staining results from our hospital. Here, we first gathered all-inclusive data to confirm that PLK1 is markedly up-regulated in GC patients. Furthermore, we observed significant differences between PLK1 high and low expression groups regarding tumor pathological stage. More importantly, GC patients with PLK1 overexpression had inferior survival outcomes and a significantly increased risk of lymph node metastasis. The bioinformatics analysis indicated that PLK1 plays a vital part in the regulation of the cell cycle and in cancer-related signaling pathways, thereby contributing to the stepwise tumorigenesis and progression of GC. Based on the above evidence, careful monitoring of PLK1 could be clinically useful for decision-making in GC.
The first aim of the study was to clarify the expression pattern of PLK1 in GC. As expected, the pooled SMD reached 1.21 (95% CI: 0.65-1.77, P<0.001) under the random-effects model, and no publication bias was observed. Similar to PLK1 mRNA, the pooled OR also revealed remarkably increased expression of PLK1 protein in GC in comparison to normal gastric tissue. Furthermore, PLK1 showed a robust ability to distinguish cancerous from non-cancerous gastric tissues in the SROC analysis (AUC: 0.86; 95% CI: 0.82-0.88, P<0.001). These results demonstrated that PLK1 is significantly overexpressed in GC tissues. Consequently, PLK1 might act as a cancer-promoting factor and participate in GC tumorigenesis.
To further evaluate the clinical relevance and prognostic value of PLK1 in GC patients, we showed, for the first time by meta-analysis, that remarkably increased expression of PLK1 is an unfavorable prognostic factor for OS in GC patients. The prognostic value of PLK1 in cancers currently attracts considerable attention. Zhang et al performed a meta-analysis using eleven eligible articles and demonstrated that an increased expression of PLK1 indicated a higher risk of worse survival in breast cancer patients [12]. Liu et al compared the OS time between higher PLK1 and lower PLK1 groups in 25 cancer types based on the data provided by TCGA. The study revealed that cases with higher PLK1 suffered remarkably worse OS as compared to those with lower PLK1 in 10 cancer types: adrenocortical carcinoma, bladder urothelial carcinoma, breast invasive carcinoma, kidney renal clear cell carcinoma, kidney renal papillary cell carcinoma, brain lower-grade glioma, lung adenocarcinoma, pancreatic adenocarcinoma, skin cutaneous melanoma, and uterine corpus endometrial carcinoma [34]. For GC, the results from our study suggest that PLK1 overexpression may be a valuable biomarker for the assessment of prognosis. Additionally, this meta-analysis documented that elevated PLK1 was closely linked with advanced TNM stage and lymph node metastasis. Metastasis is a major cause of death in cancers and complicates treatment [35]. Hence, an in-depth comprehension of the molecular mechanism of PLK1 in tumor metastasis and inferior prognosis may pave the way toward new prognostic models. The mechanisms by which PLK1 acts in GC are still not clear. We collected 100 genes that had a similar expression pattern to PLK1 in GC and conducted GO and KEGG pathway enrichment analysis.
GO enrichment analysis indicated the possible functions of PLK1 in GC, and the results from three GO terms and KEGG pathway analysis determined that PLK1 was mainly involved in the cell cycle, which might be the mechanism by which PLK1 exerts its versatile and critical biological function in GC. Consistent with the results from GO and KEGG analysis, PLK1, as a master controller of the cell cycle, has been widely recognized [36,37]. Several studies have identified that PLK1 could impair the apoptosis, and enhance mitosis, cell growth and metastasis potential of GC cells. For example, in one of the GC cell lines, SNU-638, Jang YJ [19] found that depletion of PLK1 inhibited cell proliferation and caused apoptosis. Interestingly, Chen XH [38] also confirmed similar results by blockage of PLK1 expression in GC cells MKN45. To further probe into the molecular mechanism of PLK1, we obtained 9 hub genes by using PPI analysis. These hub genes were MAD2L1, CHEK1, CDC45, CDC20, CCNB2, CCNB1, CCNA2, BUB1B and BUB1, which were all upregulated in GC significantly and were positively related to PLK1. Hence, it could be speculated that PLK1 regulates cell cycle pathways by co-operating with these nine genes, and thereby facilitates the carcinogenesis and development of GC. Based on these findings, PLK1 is regarded as a key player in orchestrating the cell cycle, leading to the progression of GC.
In addition to GO and KEGG analysis, the results from GSEA suggested that MTORC1 signaling was most closely related to elevated PLK1 expression in GC patients. To date, most studies on PLK1 and MTORC1 signaling have been separate, and little is known about the crosstalk between them. Recently, Ruf et al proposed that PLK1 inhibited MTORC1 and thereby positively contributed to autophagy in HeLa cells [39]. The role of autophagy in cancer has been extensively studied but remains poorly understood. Cai et al [16] demonstrated that PLK1 drove gastric tumor epithelial-mesenchymal transition (EMT) via targeting AKT signaling. AKT is also a key part of the MTORC1 signaling pathway and is closely related to metastasis, proliferation and invasion processes of gastric cancer [40][41][42]. Taken together, these findings suggest that PLK1 may be involved in autophagy in GC.
Molecularly targeted therapy and prognosis assessment have opened up new avenues for clinical cancer treatment. For molecular therapy, several preclinical studies with PLK1 inhibitors are underway [43][44][45]. Thus, we speculated that the combinatorial targeting of MTORC1 or AKT and PLK1 may be more meaningful and efficient for clinicopathological and prognostic estimation. Although some preclinical studies have been terminated due to side effects, looking for efficient and specific PLK1 molecular inhibitors will be the next priority for the clinical treatment of tumors. This truly pleiotropic kinase, assuredly, will continue to fascinate in the future.
However, the results should be interpreted cautiously due to some limitations that may have influenced the reliability of our conclusions. First, PLK1 expression was analyzed by various detection methods and diverse RNA detection platforms rather than by a uniform method. This lack of standardization may introduce heterogeneity and inaccuracy; consequently, the random-effects model was selected to reduce the impact of heterogeneity on our results. A large multicenter study is needed to determine the most suitable detection method for PLK1. Second, as some studies did not provide accurate original survival data, some of the survival data were indirectly extracted from Kaplan-Meier curves by Engauge Digitizer. Accordingly, the corresponding HRs and 95% CIs may be less reliable. Third, specific signaling pathways regulated by PLK1 in GC still must be validated using in vitro or in vivo experiments.
Taken together, the study proposes that PLK1 is a valid prognostic marker and will play a significant role

Data mining
Microarray or RNA-seq datasets, which were available for PLK1 expression pattern and relevant clinical information appraisal in GC, were downloaded and extracted from TCGA data portal (https://cancergenome.nih.gov/), GEO (https://www.ncbi.nlm.nih.gov/geo/) and Oncomine (https://www.oncomine.org/resource/main.html). In addition, a computer-aided systematic literature search was also performed in electronic databases PubMed, Embase, MEDLINE, Web of Science, Cochrane library, China National Knowledge Infrastructure (CNKI), and Wanfang, terminating in June 2017. The following keywords were used: ("polo-like kinase 1" OR STPK13 OR PLK1) AND (malignan* OR cancer OR tumor OR tumour OR neoplas* OR carcinoma) AND (gastric OR stomach). Cited references from identified primary studies or reviews were also manually scanned to avoid missing extra related studies.

Data selection principles
For microarray and RNA-seq datasets, eligible records were included if they fulfilled all of the principles as follows: (1) there was a proven diagnosis of GC in humans; (2) data included expression profiling data of PLK1 in GC and could be used for analysis; (3) the number of samples included in each record containing PLK1 expression value in tumor and non-tumor was not less than 10; and (4) the records used for analyzing the clinicopathological significance of PLK1 included more than 30 samples.
All eligible published studies were required to match the criteria listed below: (1) diagnosis of GC was confirmed pathologically; (2) the clinical value of PLK1 in GC patients was reported; (3) the study was original; (4) an OR with its 95% confidence interval (CI) for a clinicopathological parameter, or an HR with its 95% CI, could be obtained from the article directly or estimated from sufficient information in the paper; and (5) studies were published as a full-text paper in English or Chinese, although no language restrictions were imposed initially.

Data extraction
Two investigators independently reviewed the included datasets and publications and extracted the relevant data for the study. Ambiguous or unclear details were resolved through discussion with a third investigator. For datasets and literature relating to the expression level of PLK1, the following parameters were retrieved: first author, year of publication, country, data source, test method or platform, expression values of PLK1, sample size in both cancer and control groups and AUC value. To further confirm its dysregulated expression, true positive (TP), false positive (FP), false negative (FN) and true negative (TN) counts were also extracted directly or derived from the ROC curve analysis (data not shown).
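As an illustration of how the four extracted counts relate to the diagnostic measures reported in the Results, the following minimal Python sketch computes sensitivity and specificity from a single 2x2 table. The function name and the example counts are hypothetical and are not taken from any dataset in this study; the pooled SROC analysis itself combines such tables across studies with a dedicated model that is not reproduced here.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 diagnostic table."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer sample and one control sample")
    sensitivity = tp / (tp + fn)   # share of cancer samples called positive
    specificity = tn / (tn + fp)   # share of control samples called negative
    return sensitivity, specificity


# Hypothetical counts for illustration only; not taken from any dataset in this study.
sens, spec = diagnostic_accuracy(tp=82, fp=25, fn=18, tn=75)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```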
For records on the prognostic role or clinicopathological significance of PLK1, the extracted characteristics comprised first author, data source, year of publication, the region of publication, sample size, test method or detection platform of PLK1 expression, cutoff values, type of survival data and HR with its 95% CI. Because multivariate analysis takes associations among multiple parameters into consideration, it is considered more accurate [46]. Therefore, when univariate and multivariate analyses were presented simultaneously, we chose the latter. HRs and their 95% CIs were extracted from the original studies directly if provided. If HRs were not reported in the original study directly, the data were estimated from the Kaplan-Meier (K-M) survival curves by the software Engauge Digitizer version 4.1 (http://markummitchell.github.io/engauge-digitizer/) [47]. Additionally, we also calculated HRs using univariate and multivariate Cox analyses based on the expression level of PLK1 and follow-up data provided by datasets. ORs with their 95% CIs were calculated to evaluate the correlation between high expression of PLK1 and general clinicopathological parameters, including depth of tumor invasion (T1+T2 vs. T3+T4), gender (female vs. male), age (<60 vs. ≥60), lymph node metastasis (negative vs. positive), histology grade (G1+G2 vs. G3+G4), tumor TNM stage (I+II vs. III+IV) and metastasis (negative vs. positive). For those datasets that provided PLK1 expression values and clinical information, we extracted the requisite data by dividing patients into high PLK1 expression and low PLK1 expression groups, using the median value of PLK1 as the cut-off.
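For readers unfamiliar with how an OR and its 95% CI follow from a 2x2 table of expression group against a clinicopathological parameter, a minimal Python sketch is given below. It assumes the standard log-odds (Woolf) interval; the text does not state which interval method the statistical software applied, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a: high PLK1, feature present    b: high PLK1, feature absent
    c: low PLK1, feature present     d: low PLK1, feature absent
    All four cells must be non-zero for this simple formula.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (e.g. lymph node metastasis positive vs. negative), illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=20, c=20, d=30)
print(f"OR={or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to a significant association at the chosen level.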

Immunohistochemistry staining and evaluation
The expression level of PLK1 protein was detected by IHC staining in 43 GC tissues and their adjacent normal gastric tissues, which were obtained from the First Affiliated Hospital of Guangxi Medical University, People's Republic of China from January 2016 to April 2017. Formalin-fixed paraffin-embedded tissue samples were prepared into 4-μm-thick tissue sections. Then, the sections were dewaxed. Antigen retrieval was performed by pressure cooking at 95°C for 1 h. The samples were incubated with the primary PLK1 rabbit polyclonal antibody diluted 1:500 (Abcam) at 37°C for 1 h. The remaining steps of the immunohistochemistry procedure were performed according to the manufacturer's instructions, and the final results were determined by two pathologists independently (Yi-wu Dang and Gang Chen). To account for regional differences in staining, the IRS was applied. Under light microscopy, 10 typical high-power visual fields were observed at random. Two parameters, the staining intensity and the percentage of cells stained, were evaluated in each sample to calculate the final IRS. The staining intensity was recorded as 0 if no staining was observed, 1 for weak staining, 2 for moderate staining and 3 for strong staining. Meanwhile, the percentage of cells stained was recorded as 0 if no cells were stained, 1 for <10% of stained cells, 2 for 11-50% of stained cells, 3 for 51-80% of stained cells, and 4 for more than 80% of stained cells. The two scores were multiplied, yielding an IRS ranging from 0 to 12. All the GC patients were then divided into two groups: PLK1 negative (IRS<6) and PLK1 positive (IRS≥6) [15].
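The scoring rule above can be written out as a short Python sketch. The function names are hypothetical, and because the stated bins leave the values between categories (for example exactly 10%) unassigned, the boundary handling in the code is an assumption rather than the authors' rule.

```python
def percentage_score(pct_stained):
    """Map the percentage of stained cells (0-100) to the 0-4 score described in the text."""
    if not 0 <= pct_stained <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if pct_stained == 0:
        return 0
    if pct_stained <= 10:   # assumption: exactly 10% falls in the lowest non-zero bin
        return 1
    if pct_stained <= 50:
        return 2
    if pct_stained <= 80:
        return 3
    return 4                # more than 80% of cells stained


def immunoreactivity_score(intensity, pct_stained):
    """IRS = staining intensity (0-3) multiplied by the percentage score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * percentage_score(pct_stained)


def plk1_status(irs):
    """Dichotomize at the cut-off used in the text: IRS >= 6 is PLK1 positive."""
    return "positive" if irs >= 6 else "negative"


# Example: strong staining in 60% of cells gives 3 * 3 = 9, which is PLK1 positive.
example_irs = immunoreactivity_score(intensity=3, pct_stained=60)
print(example_irs, plk1_status(example_irs))
```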

Statistical analysis for meta-analysis
All microarray or RNA-seq datasets downloaded from public databases were log2 transformed for further analysis. First, we organized the expression level of PLK1 in carcinomas and controls of each record and presented them as the means ± SD (standard deviation). The PLK1 expression pattern of each dataset was visualized by scatter plots and ROC curves. Then, the pooled SMD with 95% CI was calculated. An SMD>0 indicated that PLK1 had a higher expression level in cancerous than in non-cancerous samples, and the difference was considered statistically significant if the 95% CI did not cross 0. To further determine the ability of PLK1 to differentiate GC tissues from controls, an SROC curve was generated and the AUC was obtained together with the sensitivity and specificity. The value of PLK1 expression for GC clinical outcome was evaluated through the pooled HR and its 95% CI. A pooled HR>1 indicated a poor prognosis in GC patients with overexpression of PLK1, and the association was considered significant if the 95% CI did not overlap 1. ORs with 95% CIs were used to elucidate the relationship between PLK1 expression and clinicopathological parameters.
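To make the per-dataset effect size concrete, the sketch below computes an SMD from group means, SDs and sample sizes. It assumes Cohen's d with a pooled SD and a common large-sample approximation of its standard error; the text does not specify which SMD estimator the software used, and the function name and input values are hypothetical.

```python
import math


def standardized_mean_difference(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Cohen's d (group 1 minus group 2) with an approximate confidence interval."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    # Large-sample approximation of the standard error of d (an assumption of this sketch).
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, d - z * se, d + z * se


# Hypothetical log2 expression summaries for tumor vs. non-tumor tissue, illustration only.
smd, smd_low, smd_high = standardized_mean_difference(8.0, 1.0, 30, 7.0, 1.0, 30)
print(f"SMD={smd:.2f}, 95% CI: {smd_low:.2f}-{smd_high:.2f}")
```

Under the rule stated above, an interval lying entirely above 0 would be read as significant up-regulation in the tumor group.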
The chi-squared-based Q-test and the I-squared (I²) statistic were used to analyze the heterogeneity across studies [48]. I² > 50% or a P value less than 0.05 indicated heterogeneity among studies. We selected a fixed-effect model or a random-effects model according to the heterogeneity analysis. To ensure the robustness of results, sensitivity analyses were performed by sequentially omitting individual studies to evaluate the impact of each dataset on the pooled results. Additionally, Begg's funnel plots and Egger's test were applied to assess publication bias. If P>0.05, there was no significant publication bias. All calculations were performed with SPSS 20.0 (IBM, New York, USA) and Stata 12.0 (Stata Corporation, College Station, TX, USA).
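The heterogeneity statistics and the two pooling models can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' Stata workflow: it uses inverse-variance weights, the DerSimonian-Laird estimator for the between-study variance, and chooses the model from I² alone (the P value of Q is omitted to keep the example dependency-free). The function name and input values are hypothetical.

```python
import math


def pool_effects(effects, variances):
    """Inverse-variance pooling with Cochran's Q, I-squared and a DerSimonian-Laird tau-squared."""
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return {
        "Q": q,
        "I2": i2,
        "tau2": tau2,
        "fixed": fixed,
        "random": random,
        "se_random": math.sqrt(1.0 / sum(re_weights)),
        # Simplified rule for this sketch: I2 > 50% selects the random-effects model.
        "model": "random" if i2 > 50 else "fixed",
    }


# Hypothetical per-study effects (e.g. SMDs or log HRs) and their variances.
result = pool_effects(effects=[0.2, 0.8, 1.4], variances=[0.04, 0.04, 0.04])
print(result)
```

A leave-one-out sensitivity analysis of the kind described above amounts to calling the same function repeatedly with one study removed each time and comparing the pooled estimates.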

The relevant genes of PLK1 and functional enrichment analysis
The online database GEPIA (http://gepia.cancer-pku.cn/index.html) is an interactive web server for analyzing RNA sequencing expression data based on the TCGA and GTEx projects [49]. We obtained a series of genes with expression patterns similar to PLK1 in gastric cancer via the database.
Gene ontology (GO) analysis was performed for the functional annotation of these related genes. The pathways in which the related genes mainly participated were investigated by KEGG pathway analysis. GO terms and pathways with a P value < 0.05 were considered significant. Both GO and KEGG pathway analyses were carried out in the Database for Annotation, Visualization and Integrated Discovery (DAVID). Enrichment maps visualizing the results were drawn with R software.

PPI network analysis and the hub genes
The online STRING 10.5 database (https://string-db.org/) is commonly used to analyze protein interactions. We constructed the PPI network using the 14 similar genes that were enriched in the cell cycle pathway. Hub genes were identified according to the degree of each node. We obtained the matrix plots of the expression levels of the hub genes from TCGA via the online database GEPIA. The scatter plots of the correlation between PLK1 and the hub genes were also generated by GEPIA.
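Degree-based hub selection can be illustrated with a short Python sketch that counts the distinct interaction partners of each node in an edge list. The degree threshold, the function names and the placeholder gene labels are all hypothetical; the text does not state the cut-off that separated the nine hub genes from the remaining nodes.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected edge list."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for pair in unique_edges:
        for node in pair:
            degrees[node] += 1
    return degrees


def hub_genes(edges, min_degree):
    """Return nodes whose degree reaches min_degree (threshold chosen by the analyst)."""
    degrees = node_degrees(edges)
    return sorted(gene for gene, degree in degrees.items() if degree >= min_degree)


# Placeholder network for illustration; these are not real STRING interactions.
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_B", "GENE_A"),  # duplicate of the first edge in reverse order
]
toy_hubs = hub_genes(toy_edges, min_degree=3)
print(toy_hubs)
```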

Gene set enrichment analysis
To further understand PLK1-related canonical pathways and biological processes in GC, GSEA was carried out (http://www.broad.mit.edu/gsea) [50]. All GC patients in the TCGA cohort were distributed into two groups based on the median expression value of PLK1 and the expression level of PLK1 was used as the phenotype label. For use with GSEA software, the collection of annotated gene sets of h.all.v6.0.symbols.gmt was chosen as the reference gene sets. FDR < 0.01 was used as the cut-off criteria.
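The median split used to define the two phenotype groups can be sketched as follows. The function name and sample identifiers are hypothetical, and assigning samples that equal the median to the low group is an assumption, since the text does not describe how ties were handled.

```python
from statistics import median


def median_split(expression):
    """Label each sample 'high' or 'low' relative to the median PLK1 expression."""
    cutoff = median(expression.values())
    # Assumption: values equal to the median are placed in the 'low' group.
    return {
        sample: ("high" if value > cutoff else "low")
        for sample, value in expression.items()
    }


# Hypothetical log2 PLK1 expression values keyed by sample ID, illustration only.
toy_expression = {"S1": 5.1, "S2": 6.4, "S3": 7.9, "S4": 8.3}
labels = median_split(toy_expression)
print(labels)
```

The resulting labels correspond to the two-class phenotype file that GSEA expects alongside the expression matrix.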