Identification of CHEK1, SLC26A4, c-KIT, TPO and TG as new biomarkers for human follicular thyroid carcinoma

The search for preoperative biomarkers of thyroid malignancies, in particular for the diagnosis of follicular thyroid carcinoma (FTC), is of utmost clinical importance. We therefore aimed to screen for potential biomarker candidates for FTC. To evaluate dynamic alterations in molecular patterns as a function of thyroid malignancy progression, a comparative analysis was conducted in clinically distinct subgroups of FTC and poorly differentiated thyroid carcinoma (PDTC) nodules. NanoString analysis of formalin-fixed paraffin-embedded (FFPE) samples was performed in 22 follicular adenomas, 56 FTC and 25 PDTC nodules, including oncocytic and non-oncocytic subgroups. The expression levels of CHEK1, c-KIT, SLC26A4, TG and TPO were significantly altered in all types of thyroid carcinomas. Based on the collective changes of these biomarkers, which correlate with one another, a predictive score was established that discriminates between benign and FTC samples with high sensitivity and specificity. Additional transcripts related to thyroid function, cell cycle, circadian clock, and apoptosis regulation were altered only in the more aggressive oncocytic subgroups, with expression levels correlating with disease progression. Distinct molecular patterns were observed for oncocytic and non-oncocytic FTCs and PDTCs. A predictive score based on the collective alterations of the biomarkers identified here might help to improve the preoperative diagnosis of FTC nodules.


INTRODUCTION
Thyroid carcinomas are the most common type of endocrine malignancy, with an incidence that is steadily increasing worldwide [1]. Thyroid carcinomas are classified according to cell origin, with well-differentiated thyroid carcinomas (papillary thyroid carcinoma (PTC) and follicular thyroid carcinoma (FTC)) being the most frequent types. FTC is further sub-classified into oncocytic and non-oncocytic subtypes with distinct genomic, epigenomic, proteomic and clinical profiles, according to the Cancer Genome Atlas Research Network. Poorly differentiated and undifferentiated (anaplastic) thyroid carcinomas (PDTC and ATC) are less common but more aggressive [2]. Fine-needle aspiration (FNA) biopsy is the standard diagnostic test recommended for the clinical evaluation of thyroid non-secreting nodules ≥ 1 cm [3]. While FNA allows for the reliable recognition of most classical PTC cases, it remains indeterminate in about 20-30% of cases, mostly for malignant follicular lesions. Differentiating between benign follicular adenoma and FTC is virtually impossible based on cytological features, since the hallmark of malignancy in FTC is the presence of capsular or vascular invasion, which cannot be assessed by FNA. Therefore, surgery is generally recommended for these patients [3]. Postoperatively, the majority of indeterminate cases are classified as benign, revealing a significant rate of unnecessary surgeries, complications, and even morbidity [1].
Numerous studies have aimed to find predictive factors of malignancy before patients undergo surgery, including genetic analyses and the search for molecular biomarkers [4]. A particular effort has thus been undertaken in the field to explore molecular alterations and genetic mutations that may allow for accurate preoperative clinical diagnosis of FTC ([5] and references therein). The presence of RAS point mutations or the PAX8/PPARγ rearrangement in FTC might represent such diagnostic markers, with RAS also demonstrating a strong association with disease aggressiveness [6]. However, the sensitivity of these gene analyses is very low. Moreover, RAS or PAX8/PPARγ alterations are also found in a subset of follicular adenomas, which limits their predictive value. Substantial efforts, including large-scale screening studies, have revealed numerous potential biomarkers for the preoperative diagnosis of FTC; however, none of these provides conclusive results for patients with indeterminate thyroid FNA cytology [7]. Therefore, the search for reliable preoperative markers of FTC in cases with indeterminate cytology remains of utmost clinical importance.
Our recent work has allowed for the identification of new potential biomarkers in postoperative PTC FFPE samples [8] employing NanoString analysis [9]. Parallel assessment of changes in the expression levels of several biomarkers in the same sample has allowed us to establish a predictive score based on the combined changes of these candidate genes, which provides a more accurate diagnostic test than alterations of a single transcript. Moreover, the cell cycle regulator CHEK1 and the circadian clock component BMAL1 have been identified as potential biomarkers for PTC [8]. Employing the settings we developed for the analysis of FFPE samples by NanoString [8], we now aimed to screen for potential biomarker candidates and molecular patterns in FTC, including different subgroups within the FTC category. To evaluate the dynamics of the transcript alterations thus identified as a function of thyroid malignancy progression, the same analysis was conducted in PDTC nodules.

Transcriptional alterations in different subgroups of FTC nodules assessed by NanoString analysis
Twelve healthy thyroid samples and 22 benign thyroid nodules (follicular adenomas), obtained during planned thyroid surgeries (see Supplementary Table S1 for donor characteristics), were subjected to NanoString analysis. A panel of 22 genes (Supplementary Table S4), comprising genes related to thyroid function, the core clock, cell cycle and apoptosis, was analyzed (for gene selection details see Materials and Methods). NanoString analysis revealed that the transcriptional pattern of the benign thyroid nodules was not significantly altered for any of the 22 analyzed genes in comparison to healthy thyroid tissue, with no transcript exhibiting a significant difference at a false discovery rate (FDR) of 5% (Supplementary Table S5).
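The 5% FDR criterion can be illustrated with a short sketch. It assumes the Benjamini-Hochberg step-up procedure, which the text does not name (the statistical analysis actually used is described in Supplementary Methods). The function name and the p-values are hypothetical and serve only as an illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if rejected at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure (not named in the text).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values (not data from this study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_calls = benjamini_hochberg(example_p, fdr=0.05)
```

With these invented p-values only the two smallest pass the 5% FDR cut-off. A gene list in which no entry passes, as reported for benign nodules versus healthy tissue, would return `False` throughout.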
In an attempt to identify the transcripts with altered expression levels upon FTC development, FTC samples (see Supplementary Table S2 for donor characteristics and postoperative diagnosis) were analyzed by NanoString. Expression levels of the same 22 genes (Supplementary Table S4) were compared between the 56 FTC samples and the 22 follicular adenomas described above. NanoString analysis revealed that the cell cycle-related transcript CHEK1 was upregulated 9-fold in FTC compared to the benign counterpart ( To address a possible correlation between the oncocytic feature of FTC nodules, which represents a more aggressive form of the disease, and clinical diagnostics, we next compared transcript changes in non-oncocytic versus oncocytic subgroups. Furthermore, taking into account the clinical and molecular heterogeneity of FTC depending on the presence of capsular or vascular invasion, a differential analysis of samples with and without capsular and vascular invasion was performed within each subgroup (Supplementary Table S2). As presented in Table 1 (subgroup 2), non-oncocytic FTC without capsular invasion exhibited a 7.7-fold upregulation of CHEK1 and a 2-fold downregulation of TPO. Similar results, with slightly stronger fold changes for both markers, were observed for non-oncocytic FTC with vascular invasion (

NanoString analysis of PDTC nodules
To address gene expression changes upon thyroid follicular carcinoma development, we next analyzed a group of 25 FFPE samples with a postoperative PDTC diagnosis (Supplementary Table S3). NanoString analysis of 22 transcripts (Supplementary Table S4) was performed for these samples and compared to the benign and FTC counterparts analyzed in parallel. The expression levels of CHEK1, c-KIT, SLC26A4, TG and TPO were altered in a more extreme manner in PDTC than in FTC (compare subgroup 1 in Tables 1 and 2). In addition, the levels of DIO2, KDR, CDKN1B, FZD1, BCL2, PER2, CRY2 and SLC5A5 were strongly downregulated in PDTC. Next, transcript level changes in oncocytic and non-oncocytic PDTC were evaluated separately. Consistent with the trend observed in FTC, relatively milder alterations were observed in the non-oncocytic PDTC subgroup for CHEK1, DIO2, KDR, SLC26A4, SLC5A5, TG and TPO (

Alterations of CHEK1, c-KIT, SLC26A4, TG and TPO expression levels in FTC exhibit pair-wise correlations
Given that the NanoString approach allows for the assessment of numerous transcript levels within the same sample, we next performed pair-wise correlation analysis among the transcripts that exhibited the most pronounced alterations in FTC. Pair-wise correlation analyses of the combined set of 56 FTC samples enrolled in this study (Supplementary Table S2) revealed that alterations of CHEK1, c-KIT, SLC26A4, TG and TPO were significantly correlated (Table 3). Specifically, c-KIT, SLC26A4, TG and TPO exhibited moderate to strong positive correlations, while CHEK1 was weakly inversely correlated with the rest of the transcripts (Table 3). This group of transcripts therefore represents a plausible cluster of biomarkers whose collective changes are associated with FTC development and might thus be predictive of FTC diagnosis.
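A pair-wise correlation analysis of this kind can be sketched as follows. The sketch assumes the Pearson coefficient, which the text does not specify, and uses invented expression values chosen so that the direction of each correlation is obvious; the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def pairwise_correlations(expression):
    """Map every unordered gene pair to its Pearson coefficient."""
    return {
        (g1, g2): pearson(expression[g1], expression[g2])
        for g1, g2 in combinations(expression, 2)
    }


# Synthetic values for four samples (not data from this study).
toy_expression = {
    "TG": [1.0, 2.0, 3.0, 4.0],
    "TPO": [2.0, 4.0, 6.0, 8.0],
    "CHEK1": [4.0, 3.0, 2.0, 1.0],
}
toy_corr = pairwise_correlations(toy_expression)
```

In this toy input TG and TPO rise together while CHEK1 falls, mirroring the sign pattern described above (positive among the downregulated transcripts, inverse for CHEK1), although real data would show weaker coefficients.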

RAS mutation analysis in the FTC samples
To further characterize the FTC nodules analyzed in this study by NanoString, we conducted N-RAS61 and H-RAS61 mutation analysis [10] on the same FTC nodules. As listed in Table 4, 8.9% (5/56) of FTC samples exhibited the N-RAS61 mutation, in agreement with previous reports [11]. A similar frequency of the N-RAS61 mutation (8.3%; 2/24) was observed for the oncocytic FTC subgroup. Surprisingly, one out of 22 benign samples exhibited an N-RAS61 mutation different from those detected in FTCs (Table 4). Of note, this particular type of N-RAS61 mutation (Ala59Pro (c.175G > C)) detected in the benign sample has never been described previously. With regard to H-RAS61, one sample among all FTC samples was identified as positive. This sample was classified in the oncocytic FTC group (Table 4). No H-RAS61 mutation was detected in the benign samples.

Predictive score for FTC diagnostics based on combined gene expression level changes
In an attempt to correlate the degree of expression level changes of CHEK1, c-KIT, SLC26A4, TG and TPO with the histological diagnosis, we aimed to establish a gene expression-based predictive score that takes into account the collective biomarker changes [12,13]. A final predictive score was established for each biological sample using the Linear Prediction Score (LPS; for details see Supplementary Methods; [13]), based on the expression levels of five distinctive genes (CHEK1, c-KIT, SLC26A4, TG and TPO) that exhibit stable changes in FTC and correlate with one another. To test the performance of the score, a receiver operating characteristic (ROC) analysis was performed, and ROC curves were established (see Supplementary Figure S1A and Supplementary Methods). Our results indicate that at a threshold of 0.725, based on empirical curve analysis (Supplementary Figure S1A, Supplementary Tables S6, S7), our predictive score discriminates FTC from benign cases with 96% sensitivity and 82% specificity. Of note, such discrimination was more sensitive for the oncocytic FTC cases than for their non-oncocytic counterparts (significantly more false negatives for non-oncocytic than for oncocytic FTC, see Figure 1). Moreover, FTCs with vascular invasion exhibited the highest scores compared with their counterparts without vascular invasion (Figure 1).
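The general idea of a linear prediction score can be sketched as below. This is an illustration under stated assumptions, not the procedure of Supplementary Methods: the weights are taken to be Welch t-statistics between the two groups, the score is the weighted sum of expression values, and a class probability is obtained by comparing normal densities fitted to the scores of each group. Only two genes are used for brevity, and all expression values and function names are hypothetical.

```python
from math import exp, pi, sqrt
from statistics import mean, stdev

GENES = ["CHEK1", "TPO"]  # two genes for brevity; the study uses five

# Synthetic log-scale expression values (not data from this study).
malignant = [
    {"CHEK1": 8.0, "TPO": 3.0}, {"CHEK1": 9.0, "TPO": 2.0},
    {"CHEK1": 8.5, "TPO": 2.5}, {"CHEK1": 9.5, "TPO": 3.5},
]
benign = [
    {"CHEK1": 5.0, "TPO": 6.0}, {"CHEK1": 5.5, "TPO": 7.0},
    {"CHEK1": 4.5, "TPO": 6.5}, {"CHEK1": 6.0, "TPO": 5.5},
]


def welch_t(a, b):
    """Welch t-statistic; assumed here as the per-gene weight."""
    return (mean(a) - mean(b)) / sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))


# One weight per gene: positive if higher in malignant, negative if lower.
weights = {
    g: welch_t([s[g] for s in malignant], [s[g] for s in benign]) for g in GENES
}


def lps(sample):
    """Weighted sum of the sample's expression values."""
    return sum(weights[g] * sample[g] for g in GENES)


mal_scores = [lps(s) for s in malignant]
ben_scores = [lps(s) for s in benign]


def normal_pdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def prob_malignant(sample):
    """Probability of the malignant class from the two fitted normal densities."""
    score = lps(sample)
    p_mal = normal_pdf(score, mean(mal_scores), stdev(mal_scores))
    p_ben = normal_pdf(score, mean(ben_scores), stdev(ben_scores))
    return p_mal / (p_mal + p_ben)
```

Under these assumptions the resulting value lies between 0 and 1, so a cut-off such as 0.725 can be read as a probability-like threshold. Whether the published score is scaled in exactly this way is not stated in the text.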
In spite of its relatively high fold changes in FTC, CHEK1 exhibited weak correlations with the rest of the identified biomarkers. To identify the cluster of biomarkers that gives the most reliable predictive score, we tested predictive scores based on the combinations c-KIT, SLC26A4, TG, TPO and SLC26A4, TG, TPO (Supplementary Figures S1B-S1C and S2A-S2B), which gave slightly weaker specificity (77%) or sensitivity (95%), respectively. Finally, a predictive score based on the combination of BCL2, CHEK1, CRY2, KDR, c-KIT, PER2, SLC26A4, TG, and TPO was established (Supplementary Figures S1D and S2C). It allows for the discrimination of FTC from benign samples with 97% sensitivity and 78% specificity. In agreement with the first predictive score (Figure 1), specificity of the predictive score was higher for the oncocytic subgroup than for the non-oncocytic subgroup, and highest for FTCs bearing vascular invasion. We thus conclude that the predictive score based on the combination of CHEK1, c-KIT, SLC26A4, TG and TPO allows for the most accurate prediction of FTC diagnosis (Figure 1). Although these results strongly suggest a predictive value for FTC diagnosis based on the combined assessment of expression level changes in 5 genes, follow-up studies with a higher number of samples will be required to assess the validity of the score proposed here.
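How sensitivity and specificity follow from a score threshold, and how an empirical ROC curve is traced by sweeping that threshold, can be sketched as below. The scores and labels are synthetic, and calling a sample FTC when its score is at or above the threshold is an assumption; the values obtained in the study are given in Supplementary Tables S6 and S7.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called malignant.

    labels: 1 for malignant (FTC), 0 for benign.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for every observed score value."""
    return [
        (t, *sensitivity_specificity(scores, labels, t))
        for t in sorted(set(scores))
    ]


# Synthetic predictive scores (not data from this study).
toy_scores = [0.95, 0.90, 0.80, 0.60, 0.75, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, 0.725)
toy_curve = roc_points(toy_scores, toy_labels)
```

In the toy data a threshold of 0.725 misses one malignant sample and misclassifies one benign sample, giving 75% sensitivity and 75% specificity. Comparing gene combinations, as done above, amounts to repeating this calculation with the scores produced by each combination.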
In the gene expression-based score that we previously established for PTC, BRAF mutation status was employed as an additional parameter for establishing the correlation with postoperative clinical evaluation [8]. However, due to the low frequency of the N-RAS61 and H-RAS61 mutations in our FTC cohort (8.9% and 1.8%, respectively; Table 4) and the lack of correlation between mutation frequency and disease progression, the RAS mutational status was not taken into account for the predictive score established for FTC.

Altered transcript expression in human FTC: identification of new and confirmation of previously reported potential biomarkers by NanoString analysis
NanoString nCounter™, a color-coded probe-based method, represents a highly sensitive approach for the quantification of gene expression. Based on direct probe hybridization, it allows for the collective assessment of a large number of transcripts within the same sample, including high-precision analyses of FFPE samples, as previously reported by us and others [8,14,15].
Employing NanoString analysis, we report for the first time a strong upregulation of the essential cell cycle   , and in a number of non-thyroid malignancies by other groups [16,17]. A recent report reveals that CHEK2 (but not CHEK1) levels are altered in PDTC and ATC [18]. In addition, transcript levels of the solute carrier (SLC) family members SLC26A4 (encoding pendrin) and SLC5A5 were significantly downregulated in samples without consideration of FTC type, and in oncocytic FTC, respectively (Table 1). Our recent study revealed that SLC26A4 has a tendency for downregulation in human PTC [8]. Interestingly, the SLC26A4 gene methylation pattern in benign adenoma was altered in thyroid carcinoma, with methylation levels being inversely correlated to the gene expression levels, suggesting that such epigenetic changes might represent a new mechanism for altering SLC26A4 gene function during thyroid carcinoma tumorigenesis [19]. SLC5A5 was previously reported to be downregulated in thyroid carcinomas by us and others [8,20]. Of note, pendrin was suggested to be a downstream target of the TTF-1/Nkx-2.1 homeodomain transcription factor in differentiated thyroid cells [21]. In good agreement with previous work [22], our current analysis reveals that thyroglobulin (TG) was significantly downregulated in FTC (Table 1). Assuming that thyroid tissue de-differentiates upon carcinoma development, this might be a plausible mechanism by which SLC26A4 and TG are downregulated in FTC. To further explore this link, it might be interesting to assess the expression of TTF1 and Nkx2.1 in the same human carcinoma samples in the future. With regard to the strong downregulation of c-KIT that we observed in FTC (Table 1), to the best of our knowledge such downregulation has not been previously associated with human FTC, while a similar tendency has been previously reported in PTC by us and others [8].
Finally, in agreement with the previously established role of thyroid peroxidase (TPO) in oncogenic transformation in general, and its association with human thyroid carcinomas [23,24], our analysis has shown a strong downregulation of this transcript in all FTC samples (Table 1).
In summary, our study reveals for the first time that transcript levels of CHEK1 are strongly upregulated in human FTCs. Moreover, it further confirms downregulation of SLC26A4, SLC5A5, c-KIT, TG, and TPO in the same FTC samples, in good agreement with previous publications.

Cell cycle regulators and core-clock components in human thyroid carcinomas
There is growing evidence for the importance of biological rhythms in the pathophysiology and treatment of cancer [25-28]. Recent findings have revealed that the circadian clock and cell cycle might be linked [29-32]. Here, we show for the first time a downregulation of PER2 core-clock transcript levels in oncocytic FTC and in PDTC cases (Tables 1-2). Of note, PER2 has been previously demonstrated to play a key role as a tumor suppressor by regulating DNA damage-responsive pathways [33]. The levels of another clock transcript, CRY2, were significantly downregulated in PDTC, and even further downregulated in oncocytic PDTC (Table 2), in agreement with our previous study demonstrating downregulation of CRY2 in PTC (Table 5, [8]). The alterations in expression levels of PER2 and CRY2 described here in PDTCs are in line with the results of our previous study, demonstrating that the molecular characteristics of the human thyroid clock are altered in primary cultured thyrocytes derived from PDTC biopsies [34]. Furthermore, a key cell cycle regulator, CHEK1, exhibited significant alterations in all groups of malignancies, with increasing fold changes from FTC to PDTC (Table 5). An additional cell cycle regulator, CDKN1B, was significantly downregulated in PDTCs (Table 2). Finally, we demonstrate that the apoptosis-related gene BCL2, previously reported to be associated with a number of malignancies by other groups [16,17], is downregulated in FTC, PTC and PDTC, with a progressive increase in fold change associated with malignancy progression (Table 5, [8]).
Taken together, these data suggest a correlation between transcriptional changes in key components of the circadian clock and the cell cycle and an increasing risk of oncogenic transformation and progression. Further insights into the molecular mechanisms that underlie the alterations in key components of the core clock, cell cycle and apoptosis, and into their roles in thyroid malignancy progression, might be of great scientific and clinical interest.

Correlation between the molecular biomarker alterations and the clinical progression of human thyroid carcinomas
Strikingly, the pattern of molecular biomarkers identified by our analyses was strongly associated with the clinical diagnostics of the FTC and PDTC subgroups (Tables 1-2). Both oncocytic FTC and PDTC groups exhibited a higher number of altered genes than their non-oncocytic counterparts. For instance, a key component of WNT signaling, FZD1, whose downregulation might be associated with increased growth and invasiveness of FTCs [35], was significantly decreased in oncocytic FTC and PDTC only, while it remained unchanged in non-oncocytic samples (Tables 1-2). In addition, transcripts that were altered in both oncocytic and non-oncocytic subgroups exhibited consistently higher fold changes in the oncocytic group than in the non-oncocytic counterparts (compare subgroups 4-6 to 2-3 in Table 1 and subgroups 3 and 2 in Table 2). These data further support the hypothesis that oncocytic and non-oncocytic variants of human thyroid carcinomas might bear distinct molecular patterns [36]. Additionally, FTC with vascular invasion exhibited more pronounced changes of molecular markers than their counterparts without vascular invasion (Table 1), further suggesting that vascular invasion represents a hallmark of malignancy, accompanied by dramatic changes in the molecular pattern [36].
Of note, the comparative investigation of gene expression levels assessed by NanoString analyses in three major clinical groups of human thyroid carcinomas (FTC, PTC and PDTC [8]) reveals that the alteration levels of several transcripts might increase gradually in conjunction with tumor progression (Table 5). Such a tendency was observed for BCL2, CRY2, c-KIT, DIO2, FZD1, KDR, PER2, SLC26A4, TG and TPO (Table 5). For CHEK1, however, alteration levels in PTC were lower than those observed for FTC, which might be attributed to the relatively small size of the cohorts analyzed in both studies.

Towards a reliable correlation coefficient for the diagnosis of FTC: a gene expression-based predictive score
Correlation analysis of the most promising biomarkers for FTC (CHEK1, c-KIT, SLC26A4, TG and TPO) allowed for the establishment of a predictive score that discriminates between benign and FTC samples with 96% sensitivity and 82% specificity at a threshold of 0.725 (Figure 1). While this score was moderately reliable for the non-oncocytic subgroup of FTC (Figure 1), only two false negatives were observed among the oncocytic FTC cases (Figure 1). The most reliable prediction was provided for oncocytic cases with vascular invasion, based on the subset of samples analyzed in our work (Figure 1). Importantly, our predictive score is only indicative at this point and demands rigorous confirmation in subsequent follow-up studies. Assessment of preoperative biomarkers for thyroid carcinomas through microRNA screening [37] and through proteome and lipidome analyses [38,39] has recently proven to be a highly promising strategy. Thus, integrative approaches including the predictive score established here, which is based on the combined alterations of several molecular biomarker levels, possibly in combination with biomarkers assessed by microRNA, proteomic and lipidomic profiling, might hold great potential for increasing the reliability of the preoperative diagnostics of thyroid carcinomas.

Study participants and thyroid tissue sampling
FFPE samples from benign, FTC and PDTC human thyroid nodules were obtained from the archive of the Pathology Department, Geneva University Hospital.

Donor characteristics are summarized in Supplementary Tables S1-S3. Malignant tumors were classified by histopathological analysis according to the World Health Organization Classification of Thyroid Tumors [40] and staged according to the AJCC Cancer Staging Manual, 7th ed. (see Supplementary Data for more details on the diagnostics). In addition, PDTC cases were diagnosed using the Turin criteria [41]. Written informed consent was obtained from each patient and the study protocol was approved by the local Ethics Committee (CER 11-014).

RNA extraction from FFPE samples
RNA was extracted using the High Pure miRNA isolation kit (Roche) according to the manufacturer's instructions, as previously described by us in detail [8].

Gene expression quantification using multiplexed, color-coded probe pairs (NanoString nCounter™)
Fifty-three candidate genes were selected for analysis, based on our own previous studies [8,34] and on a literature search. Several transcripts previously demonstrated to exhibit strong expression level changes in FTC and PDTC, such as TIMP1, c-MET and c-KIT [20,42], were included for the correlation analysis. Probes were designed and synthesized by NanoString nCounter™ technologies. Of the 53 genes, 22 that overlapped between the three independent NanoString experiments (codesets) exhibited significant alterations in thyroid carcinoma (Supplementary Table S4) and were therefore used for subsequent analyses. Total RNA (200-400 ng) extracted from FFPE samples was hybridized with multiplexed NanoString probes, as described in [9], in 3 independent NanoString experiments (codesets 1, 2 and 3). Background correction, codeset calibration, and statistical analysis were performed as described in Supplementary Methods.