Non-invasive tumor genotyping using radiogenomic biomarkers, a systematic review and oncology-wide pathway analysis

With targeted treatments playing an increasing role in oncology, the need arises for fast non-invasive genotyping in clinical practice. Radiogenomics is a rapidly evolving field of research aimed at identifying imaging biomarkers useful for non-invasive genotyping. Radiogenomic genotyping has the advantage that it can capture tumor heterogeneity, can be performed repeatedly for treatment monitoring, and can be performed in malignancies for which biopsy is not available. In this systematic review of 187 included articles, we compiled a database of radiogenomic associations and unraveled networks of imaging groups and gene pathways oncology-wide. Results indicated that ill-defined tumor margins and tumor heterogeneity can potentially be used as imaging biomarkers for 1p/19q codeletion in glioma, relevant for prognosis and disease profiling. In non-small cell lung cancer, FDG-PET uptake and CT ground-glass opacity features were associated with treatment-informing traits including EGFR mutations and ALK rearrangements. Oncology-wide gene pathway analysis revealed an association between contrast enhancement (imaging) and the targetable VEGF-signalling pathway. Although the need for independent validation remains a concern, radiogenomic biomarkers showed potential for prognosis prediction and targeted treatment selection. Quantitative imaging enhanced the potential of multiparametric radiogenomic models. A wealth of data has been compiled for guiding future research towards robust non-invasive genomic profiling.


INTRODUCTION
Considerable progress has been made in developing targeted therapies for genomic subtypes in cancer, but patient selection for these therapies can be challenging. Radiogenomics (sometimes called imaging genomics) is a new, rapidly evolving field of research aimed at developing tools for non-invasive genotyping by identifying imaging biomarkers for genomic subtypes [1][2][3]. Radiogenomic analysis refers to the integration of radiophenotypes and genomic data in order to find radiogenomic associations (Figure 1). Radiogenomic analysis can be performed using qualitative or quantitative (computer-extracted, radiomics) imaging features, which can be used as individual biomarkers or can be incorporated in multiparametric prediction models.
Radiogenomics offers considerable advantages for genotyping. Firstly, tumor genetic heterogeneity can be captured using radiogenomics. Biopsy-based genotyping in the clinical setting is generally confined to a single sample, although multiregional genotyping has been performed effectively to capture tumor heterogeneity [4][5][6]. Radiogenomic biomarkers have shown great potential for capturing tumor heterogeneity non-invasively [7,8]. Secondly, a non-invasive method can be performed repeatedly, and is therefore eminently suitable for treatment follow-up. In addition, radiogenomic markers are important for tumors for which biopsy is unavailable (e.g. glioma, retinoblastoma) [9]. Finally, radiogenomics is fast and cost-effective, generally using routine clinical imaging. Several non-systematic reviews have been published on radiogenomics [2,3,10-19]. The main purpose of this systematic review was to provide a comprehensive oncology-wide database of radiogenomic associations, and to review their clinical usefulness. A secondary objective was to assess radiogenomics on a pathway level instead of a gene level, by performing oncology-wide gene pathway analysis in order to identify relations between imaging and oncopathways.

Database of imaging-genomics associations
We included 187 articles published between July 2004 and February 2017. A PRISMA flow diagram for the inclusion process is available in the Supplementary Table 1. Figure 2 illustrates the exponential growth of publications on a year-over-year basis. The largest groups were diffuse glioma (n = 79, 42%), non-small cell lung cancer (NSCLC) (n = 51, 27%), and breast cancer (n = 18, 10%). Studies often used multiple modalities; 105 studies used MRI (56%), 80 CT (43%), 44 FDG-PET (24%), and 5 mammography (3%). In 59/187 (32%) articles, biological explanations for imaging-genomics relations were identified. The 2440 identified radiogenomic associations in the database are presented as a pivot table, which provides an easy graphical interface to perform data queries using Microsoft Excel (2010/2013) (Supplementary Table 2). Study characteristics and quality assessment are available in the Supplementary Table 3. The results section focuses on repeatedly identified imaging-genomics associations with possible clinical application.
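The kind of pivot-table query described above can also be run outside a spreadsheet. The following is a minimal sketch in Python, assuming the associations have been exported as rows with the hypothetical fields `cancer_type`, `imaging_feature`, `gene` and `p_value`. These field names and the example rows are illustrative placeholders only; they are not the actual layout or content of Supplementary Table 2.

```python
from collections import defaultdict

# Hypothetical rows: field names and values are placeholders, not data from the review.
ROWS = [
    {"cancer_type": "glioma", "imaging_feature": "margin", "gene": "GENE_A", "p_value": 0.01},
    {"cancer_type": "glioma", "imaging_feature": "margin", "gene": "GENE_B", "p_value": 0.20},
    {"cancer_type": "NSCLC", "imaging_feature": "uptake", "gene": "GENE_A", "p_value": 0.03},
    {"cancer_type": "NSCLC", "imaging_feature": "margin", "gene": "GENE_C", "p_value": 0.04},
]


def pivot_counts(rows, row_key, col_key, alpha=0.05):
    """Count associations with p < alpha for each (row_key, col_key) cell."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        if row["p_value"] < alpha:
            table[row[row_key]][row[col_key]] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(cols) for key, cols in table.items()}


counts = pivot_counts(ROWS, "cancer_type", "imaging_feature")
```

Swapping `row_key` and `col_key` (for example `"gene"` against `"imaging_feature"`) gives a different cross-tabulation of the same rows, which is the sort of query a pivot table supports interactively.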
Quantitative CT and PET features could also predict ALK or ROS1/RET fusions (sens = 0.73, spec = 0.70) [105]. For development of prognostic imaging biomarkers, two groups used quantitative imaging for predicting prognosis-related gene clusters and found a lower kurtosis value linked with poorer survival [99]. Additionally, a module of tumor size, edge shape, and sharpness could predict survival [97]. Similarly, the prognostic value of PET-imaging was explained from a genomic perspective using radiogenomic analysis [100,101].
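For readers less familiar with the reported performance measures, sensitivity and specificity follow directly from the counts of a two-by-two confusion matrix. The sketch below uses hypothetical counts chosen for illustration; they are not taken from the cited study, and the function name is our own.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # fraction of true carriers that are detected
    specificity = tn / (tn + fp)  # fraction of non-carriers correctly ruled out
    return sensitivity, specificity


# Hypothetical counts for illustration only (not data from any included study).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=7, fp=3)
```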

Breast cancer
This review only included studies with analyses on a genomic level; imaging-receptor associations based on immunohistochemistry analysis were reviewed elsewhere [10]. High FDG-PET uptake was found for gene expression signatures of basal-like cases, while low uptake was found for luminal-like cases [106]. Low FDG-PET uptake was also associated with expression of oestrogen-receptor related genes [107]. Other studies associated luminal B genes with quantitative dynamic MRI-perfusion [108] and BRCA-mutations with sharp margins and rim enhancement on MRI [109], but these findings were not independently validated.

Hepatocellular carcinoma
Three studies were included for HCC [148][149][150]. Tumors with ill-defined margins on CT showed high expression of a gene expression signature for doxorubicin sensitivity [150]. Additionally, targetable high VEGF-expression [151] was related to attenuation, heterogeneity and tumor margins on CT [148]. A gene signature of microvenous invasion (indicating poor prognosis) could be predicted by a CT biomarker including the presence of small intratumoral internal arteries and the absence of hypodense halos [148]. A different genomic score for venous invasion was correlated with CT intratumoral arteries and margins [150].
Supplementary Table 10 summarizes associations of imaging and individual genomic features that were found in multiple cancer types. Imaging groups (N = 14) comprised features (e.g. tumor size, multifocality) of both MRI and CT in various malignancies. Results included the correlations of enhancement features with VEGF-expression in brain tumors (glioblastoma) [158] and head-neck tumors (oral cavity SCC) [156].

DISCUSSION
This study provided a comprehensive database of imaging-genomics associations, in which queries can be made (Supplementary Table 2). This review focussed on both imaging-genomics associations with possible clinical application per cancer subtype and oncology-wide patterns in radiophenotype-genotype relations.

Diffuse glioma
The 2016 WHO classification for diffuse glioma in adults is largely based on IDH1-mutation status and 1p/19q codeletion [159]. However, biopsy-based genotyping is an invasive technique that can be unreliable due to spatial tumor heterogeneity. Imaging biomarkers reflect the whole tumor and could possibly enhance genotyping accuracy non-invasively. Compared to IDH-wild type, IDH-1/2 mutated gliomas have a favourable prognosis [21-23,160]. IDH-status is the top-level diagnostic stratification after histology in the 2016 WHO classification [159]. Although 12 out of 15 studies identified associations between imaging and IDH-status, the majority of findings were not independently validated. MR-perfusion [27,28] and 2-HG MR-spectroscopy parameters [30,31,161], however, were correlated with IDH-status in multiple studies and show potential for future imaging-based IDH-mutation detection. The oncometabolite 2-HG is elevated in IDH-mutated cases and can be detected using MR-spectroscopy [29,162-164], although this is technically challenging due to overlap of neighbouring metabolites (GABA, glutamate and glutamine) in the spectrum. State-of-the-art MR systems generate the high-quality spectra needed for 2-HG detection, enabling integration into clinical practice [165]. [166,167]. EGFR aberrations were often correlated with MR-perfusion parameters, possibly due to the effect of EGFR on cell invasiveness and angiogenesis. However, despite the important role of EGFR in glioma development [168], suitable EGFR-targeted therapies for glioma have not been developed [169].

Non-small cell lung cancer
Since specific therapies are available for genomic subgroups of NSCLC, genotyping is important for directing therapy [172]. However, biopsy-based genotyping can cause treatment delay [173]. Radiogenomics may provide a reliable non-invasive tool for fast genotyping. EGFR-mutated [174,175] and ALK-rearranged [172,176,177] tumors are targetable and are therefore extensively researched in radiogenomics. Repeatedly, FDG-PET was associated with EGFR-mutations, which may be biologically explained by the activating role of mutated EGFR in glycolysis through AKT-signalling [178,179]. The majority of studies showed a higher FDG-PET uptake for EGFR-mutated tumors; one of these validated its results in an independent cohort [73]. The three studies that found a lower uptake for EGFR-mutated cases were possibly unreliable because lower uptake was either not confirmed in multivariate analysis [76], found in metastases only [75], or found because the comparison group had highly avid KRAS-cases [77]. The proportion of GGO versus solid appearance on CT might be useful to differentiate genetic NSCLC-subtypes. Seemingly, wild type tumors have a large proportion of GGO, EGFR-mutated tumors have a small GGO component, and ALK-rearranged tumors are the most solid. However, validation studies with standardised GGO measurements are needed to reliably discriminate genotypes. Similarly, standardised tumor morphology features need to be assessed in order to validate the predictive value of ill-defined tumor borders for ALK-status. Multiparametric (quantitative imaging) studies can be powerful for predicting individual genetic traits, as well as gene clusters related to prognosis. However, findings need to be validated in independent cohorts before they can be used in clinical practice.

Breast cancer
In breast cancer, most radiogenomic associations were not independently validated. Limited results indicate that FDG-PET can possibly discriminate molecular subtypes [106,107]. Gene-expression scores such as the Oncotype Dx recurrence risk test and the MammaPrint metastasis risk test are becoming increasingly important for clinical decision making in breast cancer, especially to prevent unnecessary chemotherapy. Since genetic tests are costly and time-consuming, studies have aimed to find imaging surrogates. In multiple studies, perfusion features showed potential for predicting high-risk genetic test results, indicating that tumor perfusion may be a sign of poor prognosis in breast cancer. Studies furthermore indicated the potential of perfusion imaging for predicting gene expression markers for anti-VEGF treatment response. However, clinically applicable models are yet to be established.

Colorectal carcinoma, renal cell carcinoma, hepatocellular carcinoma
In colorectal carcinoma, KRAS-mutation indicates unresponsiveness to EGFR-targeted treatment [180,181] and was associated with high FDG-PET uptake in multiple studies. The lack of this association in one study [134] and a reported low accuracy for prediction [129] might be explained by false-positive high uptake due to inflammation [136]. Although findings are not yet prospectively validated, FDG-PET has great potential for providing biomarkers for EGFR-treatment decision making in CRC. In renal cell carcinoma, the amount of calcification shows potential for predicting BAP1-status, which could be useful for assessing stage, grade and invasiveness [146,147]. However, findings need validation. Although multiparametric modelling studies in RCC were limited, great strides are being made in assessing their application for predicting prognosis and complication risk [142,143,145]. For hepatocellular carcinoma, radiogenomic biomarkers could aid both treatment selection (VEGF-targeted and doxorubicin treatment) and prognosis prediction. Microscopic venous invasion, a sign of poor prognosis and high recurrence risk, was associated with small intratumoral arteries (CT), which was independently validated [182]. However, it was noted that patients were not selected indiscriminately [183]. The number of studies and their population sizes were too small to draw conclusions.

Patterns in radiogenomic associations
Repeatedly found imaging-genomics associations show patterns among different neoplasms. A convincing relation was found for enhancement on imaging and VEGF-expression, identified in brain and head-and-neck cancers. The same association was found in a study (not included) assessing radiogenetics using immunohistochemistry in HCC [184]. For a more profound understanding of radiogenomic relations and underlying regulatory networks, insights into the related biological process can be of considerable value. Angiogenesis was the most mentioned biologic link between imaging and genomics in glioblastoma [48,54,55,57,66,67,158], oligodendroglioma [56,185], breast cancer [102,126], oral cavity SCC [156], and RCC [138]. Angiogenesis-related genes such as VEGF and EGFR genes were compared with angiogenesis-related imaging features such as perfusion and contrast enhancement. Similarly in gene pathway analysis, angiogenesis (biology) may be the link between enhancement (imaging) and VEGF-pathway-signalling (genomics).

Oncology-wide gene pathway analysis of radiogenomic associations
The importance of targeting multiple regulators in cancer pathways instead of single genes is increasingly recognized. Genes associated with enhancement were enriched for the VEGF-signalling pathway. Similar to the association between contrast enhancement and the VEGF gene, enhancement may be associated with the VEGF-signalling pathway due to its regulating role in angiogenesis. Imaging biomarkers for the VEGF pathway may have clinical implications as they could aid patient selection for VEGF-targeted treatment. VEGF-targeted therapy has been shown to be effective in various cancer types, including CRC, NSCLC, and breast cancer [186,187].
Gene pathway analysis results furthermore indicated that contrast enhancement and necrosis detected with imaging reflect MAPK and PI3K-Akt-mTOR activity. This could again aid future patient selection for targeted therapies based on the MAPK and PI3K-Akt-mTOR pathways [188][189][190]. An important limitation of the gene pathway analysis was, nevertheless, the heterogeneity of the imaging features within the particular imaging feature groups. To minimize this effect, specific subgroups were created such as "enhancement patterns" and "amount of enhancement". Another important limitation of this analysis was that different types of genomic information (e.g. gene mutation vs gene expression) were described in the literature. In addition, alterations can in principle result in either the activation or repression of the involved gene. In the performed analysis, the direction of the change, activation versus repression, could not be taken into account. Therefore, this analysis could only reveal associations.
Although standardised imaging features and genetic tests are evidently needed for further validation, results of this gene pathway analysis do reveal that oncology-wide associations between imaging groups and oncology pathways with potential clinical value may exist. Our findings not only indicate that radiophenotype-genotype associations could be similar in different cancer types, but also imply that radiogenomics could aid patient selection and monitoring of pathway-targeted treatment in the future.

Radiogenomics techniques
Different approaches were seen for conducting radiogenomic analysis. A considerable disadvantage of qualitative imaging assessment is the poor interobserver agreement. A powerful tool to overcome this is a validated feature set, such as VASARI [191] for glioma imaging [58,60,64,192]. The rapidly growing capacity of quantitative, computer-extracted imaging (perfusion, diffusion and texture features) enables more powerful and robust prediction of genomic traits. This radiogenomic approach has proven to be powerful for prognosis prediction [58-64,97,99,142], and for revealing differential pathway activity [60,148,193,194]. The trend in radiogenomics is increasingly headed towards models of multiparametric multilevel (clinical, radiological and histopathological) data, unravelling radiogenomic networks [58,63,64,105,116]. Methodologically, however, the use of quantitative imaging is still developing. Reproducibility of quantitative parameters is a major concern, since they are highly dependent on scanner systems and software packages. This remains particularly challenging in MRI, as it has less standardised quantitative values compared with CT or PET imaging. Moreover, overfitting of data models can be an issue. Standardised datasets such as The Cancer Imaging Archive (TCIA) [195] and The Cancer Genome Atlas (TCGA) [196] can provide a solution for validation in an independent cohort. Standardization of methods and prospective validation are needed before quantitative radiogenomics can be treatment informative.
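To make the notion of a computer-extracted feature concrete, the sketch below computes one first-order feature, the kurtosis of the intensities within a region of interest. It is an illustration under our own assumptions and not the pipeline of any included study: the input is a plain list of hypothetical voxel intensities, and the function returns Pearson kurtosis (a normal distribution gives 3), whereas some software packages report excess kurtosis (Pearson kurtosis minus 3). Such differences in definition are one example of the software dependence noted above.

```python
def pearson_kurtosis(values):
    """Pearson kurtosis m4 / m2**2, using population (biased) central moments."""
    n = len(values)
    if n == 0:
        raise ValueError("no values given")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant intensities")
    return m4 / m2 ** 2


# Hypothetical voxel intensities from a region of interest (illustration only).
roi = [1, 2, 3, 4, 5]
kurt = pearson_kurtosis(roi)
excess = kurt - 3  # the convention used by some packages
```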

Limitations
A limitation of this study was the marked heterogeneity of genomic and imaging features and the variety of analysis methods, which made data integration challenging. Another constraint was that the effect size and the direction of associations were not always reported. There might have been publication bias for significant findings, but the novelty of this field of research reduces this risk. A further limitation was that data of included multiparametric modelling studies were usually not published online, so these p-values could not be incorporated in the database.

Potential of radiogenomics
Radiogenomic genotyping has the advantage that it can capture tumor heterogeneity, can be performed repeatedly for treatment monitoring, and can be performed in malignancies for which biopsy is not available. Moreover, radiogenomics is cost-effective, using routine clinical imaging for analysis. The gene pathway analysis in this study revealed imaging-genomic networks in oncology and indicated that radiogenomics may be suitable for predicting efficacy of pathway-targeted therapies. Although a large number of potentially valuable radiogenomic biomarkers was identified, validation studies are needed since the robustness of features obtained by different scanners remains an important concern. This study provides an extensive database of imaging-genomic associations that can guide future research towards developing radiogenomic tools for treatment selection and prognosis prediction in human oncology. Radiogenomics, connecting multiparametric quantitative imaging with genomic data, holds great potential for non-invasive genotyping, thereby contributing to the shift towards precision medicine in oncology.

METHODS
We performed this study according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [197].

Search strategy and article selection
We systematically searched the Medline and Embase databases for English literature published until 1 February 2017 on radiogenomics in oncology with search terms referring to radiogenomics and oncology (Supplementary Table 1). References of included articles and literature reviews were checked for additional eligible studies. The following inclusion criteria were adopted: (1) the population consisted of human cancer patients; (2) the article comprised statistically assessed associations between imaging features on CT, MRI, FDG-PET or mammography and genomics; and (3) full-text was available in English. We excluded studies performing radiogenetics using immunohistochemistry analysis. We excluded case reports, editorial letters, and reviews.

Extraction of study characteristics, quality checklist
Study characteristics, quality assessment, p-values for associations and effect measures were incorporated in a database (Supplementary Table 2). P-values of studies using an extensive amount of quantitative imaging features or multiparametric models were reviewed separately. For quality assessment, the QUADAS-2 checklist [198] was used, with additional items to address radiogenomics specifically, including the availability of an independent validation cohort. All data generated or analysed during this study are included in this published article (and its Supplementary Information files).

Oncology-wide gene pathway analysis
Gene pathway analysis was performed to examine concordance between grouped radiophenotypes oncology-wide and gene pathways. For this analysis, imaging features were classified into 14 coherent imaging groups. Significant radiogenomic associations for each imaging group were selected. Only single genes were selected (e.g. no chromosome-type aberrations), and the genes were annotated according to the HUGO Gene Nomenclature Committee (HGNC) nomenclature, regardless of neoplasm location or type of genetic information (DNA mutation, gene expression (mRNA), methylation status). A minimum of 20 genes per imaging feature group was required for inclusion in the analysis. Gene pathway analysis was performed by comparing significantly associated genes within a particular imaging group with existing functional gene pathway annotations: the cancer gene pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [193,199]. The ToppGene Suite software (Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH; https://toppgene.cchmc.org/) was used for gene pathway analysis based on functional annotation, calculating p-values using the hypergeometric probability mass function. P-values were corrected for multiple testing using the Bonferroni method (cutoff value 0.05).
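The statistical idea behind this enrichment test can be sketched in a few lines of Python. This is an illustration under stated assumptions and not a reproduction of the ToppGene implementation: the function names, the background size and all gene counts are hypothetical, and the upper-tail sum P(X >= k) is used as the enrichment p-value, which is a common convention that the text above does not specify beyond naming the hypergeometric probability mass function.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k pathway genes in a list of n, with K pathway genes among N background genes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p(k, N, K, n):
    """Upper-tail probability P(X >= k) of seeing at least k pathway genes (assumed convention)."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


def bonferroni(p_values, alpha=0.05):
    """Multiply each p-value by the number of tests (capped at 1) and flag p_adj < alpha."""
    m = len(p_values)
    result = {}
    for name, p in p_values.items():
        p_adj = min(1.0, p * m)
        result[name] = (p_adj, p_adj < alpha)
    return result


# Hypothetical numbers: 25 genes in an imaging group, 4 of them in a 60-gene
# pathway, against an assumed background of 20000 genes.
p_pathway = enrichment_p(k=4, N=20000, K=60, n=25)

# Hypothetical raw p-values for two pathways tested for one imaging group.
corrected = bonferroni({"pathway_1": 0.01, "pathway_2": 0.04})
```

With two tests, the Bonferroni step doubles each raw p-value, so in this toy input only `pathway_1` stays below the 0.05 cutoff.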

CONFLICTS OF INTEREST
The authors declare no conflicts of interest.

FUNDING
There was no funding for this study.