Robust detection of immune transcripts in FFPE samples using targeted RNA sequencing

Current criteria for identifying cancer patients suitable for immunotherapy with immune checkpoint blockers (ICBs) are subjective and prone to misinterpretation, as they mainly rely on the visual assessment of CD274 (best known as PD-L1) expression levels by immunohistochemistry (IHC). To address this issue, we developed an RNA sequencing (RNAseq)-based approach that specifically measures the abundance of immune transcripts in formalin-fixed paraffin-embedded (FFPE) specimens. Besides exhibiting superior sensitivity as compared to whole transcriptome RNAseq, our assay requires little starting material, implying that it is compatible with the RNA degradation normally caused by formalin. Here, we demonstrate that a targeted RNAseq panel reliably profiles mRNA expression levels in FFPE samples from a cohort of ovarian carcinoma patients. The expression profile of immune transcripts as measured by targeted RNAseq in FFPE versus freshly frozen (FF) samples from the same tumor was highly concordant, in spite of the RNA quality issues associated with formalin fixation. Moreover, the results of targeted RNAseq on FFPE specimens exhibited a robust correlation with mRNA expression levels as measured on the same samples by quantitative RT-PCR, as well as with protein abundance as determined by IHC. These findings demonstrate that RNAseq profiling on archival FFPE tissues can be used reliably in studies assessing the efficacy of cancer immunotherapy.


INTRODUCTION
During the past few years, no fewer than four distinct monoclonal antibodies (mAbs) that interrupt immunological checkpoints, so-called immune checkpoint blockers (ICBs), have been approved by the US Food and Drug Administration (FDA) for use in cancer patients as standalone immunotherapeutic regimens or combined with other drugs [1]. These agents include the cytotoxic T lymphocyte-associated protein 4 (CTLA4)-specific mAb ipilimumab (Yervoy), which is licensed for the treatment of melanoma [2][3][4]; two mAbs targeting programmed cell death 1 (PDCD1; best known as PD-1), namely nivolumab (Opdivo) and pembrolizumab (Keytruda), which are approved for use in patients with melanoma, head and neck squamous cell carcinoma, non-small cell lung carcinoma (NSCLC) and Hodgkin's lymphoma [5][6][7][8][9][10][11][12][13][14]; and atezolizumab (Tecentriq), a mAb targeting CD274 (best known as PD-L1) recently approved for use in bladder carcinoma patients [15,16]. ICBs can mediate robust clinical effects as they release immune effectors from cancer-driven immunosuppression, hence activating novel or reactivating existing tumor-targeting immune responses [17]. However, only a limited fraction of patients (generally <30%) benefit from ICBs as standalone immunotherapeutic agents [1]. Moreover, it is estimated that the total market for ICBs may reach 7 billion US dollars by 2020 [18]. Thus, there are both clinical and economic challenges associated with the growing use of ICBs, calling for the development of cost-effective and reliable selection procedures.
Currently, the immunohistochemical assessment of PD-L1 expression level on formalin-fixed paraffin-embedded (FFPE) specimens is the only test employed in the clinic to guide the use of ICBs, and is an approved companion diagnostic for NSCLC patients considered for pembrolizumab treatment [19,20]. Other potential indicators of response to checkpoint inhibition include mutational burden [21][22][23] and tumor-infiltrating lymphocyte (TIL) abundance [24][25][26], but both lack sensitivity and specificity as single biomarkers. An alternative approach to predicting clinical response to ICBs involves digital gene expression analysis by RNA next generation sequencing (NGS). While this methodology works well on fresh frozen (FF) samples, it has demonstrated suboptimal performance on more readily available archival FFPE specimens [27,28]. Thus, there is no test available today to reliably predict whether a cancer patient will respond to immunotherapy with ICBs.
To begin to address this major gap in clinical care, we applied targeted amplicon-based RNA sequencing (RNAseq) to a panel of 395 transcripts related to T-cell receptor signaling (TCRS), tumor infiltration by immune cells, and other immunological functions that are key for anticancer immunosurveillance. RNAseq is particularly adept at detecting poorly represented transcripts. We optimized our assay for RNA isolated from FFPE samples, which suffer from RNA degradation as a result of formalin fixation. We are currently focusing on a subset of these genes to generate a focused RNAseq panel (which we named Immune Advance) that predicts clinical response to ICBs.
Here, we present data demonstrating that the Immune Advance assay on FFPE samples is associated with a low failure rate and produces gene expression profiles that are highly concordant with those obtained on FF specimens. Moreover, we report that the expression levels of three prototypic biomarkers, namely CTAG1B (a tumor-associated antigen best known as NY-ESO-1) [29], CD8 (a biomarker of cytotoxic T lymphocytes) [30,31], and PD-L1 (see above) [32], measured by RNAseq, quantitative RT-PCR (qRT-PCR) and immunohistochemistry (IHC) exhibit high levels of correlation. Thus, the Immune Advance assay can accurately profile gene expression in FFPE samples as an instrument to predict clinical response to ICBs.

Reproducibility of the Immune Advance assay
RNA was extracted from matched FFPE and FF sections of 14 ovarian cancer specimens (Figure 1A) and analyzed across three RNAseq runs. Mean read length averaged 112 bp and the percentage of aligned bases was 96% (Supplementary Table S1). All samples yielded at least 2,554,065 mapped reads, with the exception of one sample, 14-FFPE, which yielded only 2,354 mapped reads (Supplementary Table S2) and failed our internal quality control. On average, we achieved 4,432,474 mapped reads (after excluding sample 14-FFPE), which represents a sufficient depth for digital gene expression profiling of 395 genes. Likewise, 89.55% of mapped reads were on target, which is consistent with best RNAseq practice [33]. Overall, 27/28 samples passed quality control, indicating 100% and 93% assay robustness for FF and FFPE samples, respectively. Normalized reads per million (nRPM) values derived from each FFPE/FF sample pair were correlated using the Pearson method. The mean, minimum, and maximum Pearson correlation coefficients obtained from 13 paired FFPE/FF samples were 0.920, 0.837, and 0.969, respectively (Figure 1B and Supplementary Figure S1).
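The robustness figures reported above follow from simple pass/fail arithmetic. The following Python sketch reproduces them under one loud assumption: the cutoff MIN_MAPPED_READS is a hypothetical value chosen only for illustration, because the actual internal quality-control threshold is not stated in the text. The function names are likewise illustrative.

```python
# Hypothetical cutoff: the actual internal QC threshold is not given in the text.
MIN_MAPPED_READS = 1_000_000


def passes_qc(mapped_reads, min_mapped_reads=MIN_MAPPED_READS):
    """Return True when a sample has enough mapped reads to be analyzed."""
    return mapped_reads >= min_mapped_reads


def robustness(pass_flags):
    """Percentage of samples that passed quality control."""
    return 100.0 * sum(pass_flags) / len(pass_flags)


# Pass/fail flags as reported: all 14 FF samples passed, whereas
# sample 14-FFPE (2,354 mapped reads) failed among the 14 FFPE samples.
ff_flags = [True] * 14
ffpe_flags = [True] * 13 + [passes_qc(2_354)]

print(round(robustness(ff_flags)))    # FF robustness, in percent
print(round(robustness(ffpe_flags)))  # FFPE robustness, in percent
```

Any cutoff lying between the failed sample (2,354 mapped reads) and the lowest passing sample (2,554,065 mapped reads) would yield the same pass/fail calls for this cohort.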

Clinical validity
To obtain insights into the potential clinical application of the Immune Advance assay we compared nRPM values (as obtained by targeted RNAseq) with ΔCt values (as obtained by qRT-PCR) for NY-ESO-1, CD8 and PD-L1, finding robust correlation coefficients of 0.9402 (p < 0.0001), 0.9063 (p < 0.0001) and 0.9132 (p < 0.0001), respectively (Figures 2-4). The immunohistochemical assessment of NY-ESO-1 levels identified 2/14 (14%) positive samples, with sample #2 expressing NY-ESO-1 in 5% of neoplastic cells, and sample #7 in its totality. nRPM values also highlighted a similar binary distribution of positive versus negative samples, and RNAseq results correlated with both qRT-PCR and IHC findings, although the analysis was limited by the presence of only two positive specimens (Figure 2A-2C).
Finally, four samples exhibited some degree of membranous PD-L1 staining in 5%-100% of neoplastic cells. These specimens (namely, samples #5, #6, #8 and #10) also exhibited high nRPM values. Interestingly, the specimen with the highest amount of PD-L1-encoding RNA as per the Immune Advance assay, namely sample #10, only contained 35% PD-L1+ neoplastic cells. It is important to recognize that sample #10 would be considered a negative result according to the 22C3 pharmDx scoring guidelines. Log2-transformed nRPM and 1/ΔCt values for PD-L1 exhibited robust linear correlation, implying that the expression of this clinically relevant biomarker is not binary like that of NY-ESO-1, but rather continuous like that of CD8 (Figure 4A-4C).

DISCUSSION
To assess the usefulness of RNAseq in profiling a panel of immune transcripts on FFPE versus FF samples, we utilized a stringent quality control process for tumor heterogeneity that is unique to our study and to the best of our knowledge has never been applied before [34][35][36][37][38]. Tumors were halved as FF and FFPE mirror images with the cut section of each half used for analysis. Basic quality control factors for RNAseq such as mapped reads were consistent for FFPE and FF samples. Paired samples correlated with coefficients ranging from 0.837 to 0.969. Moreover, as proof of principle, results from RNAseq and qRT-PCR were highly concordant for NY-ESO-1, CD8, and PD-L1. These findings establish the feasibility of measuring a panel of immune transcripts by RNAseq on FFPE samples to develop a biomarker that predicts clinical response to ICBs.
Our study also demonstrates the usefulness of a targeted RNAseq panel to replace immunohistochemical markers including CD8 for cytotoxic T lymphocytes, and PD-L1 as predictors of response to ICBs. While there have been numerous publications on the "immunoscore" as a prognostic biomarker for cancer patients [39][40][41], several obstacles prevent implementation of an IHC-based approach into clinical routine [42]. Moreover, there is disagreement on the definition of a "high" versus a "low" TIL score in absolute terms, as most publications refer to one or a few private or public patient cohorts wherein stratification is based on median values [30,31]. Despite the small sample size of our study, our results support the notion that CD8+ T cells can be quantified and that their number linearly correlates with CD8 mRNA expression, as determined by RNAseq. Moreover, our RNAseq data support the notion that PD-L1 expression is continuous rather than binary, in contrast with IHC results. Whether there is a post-translational mechanism that operates to control PD-L1 exposure in a binary manner remains to be determined. If this is not the case, the test currently employed in the clinic to measure PD-L1 expression may lack sensitivity and accuracy, which would have a negative impact on patient selection for immunotherapy with ICBs.
Of note, some tumors that stain positively for PD-L1 by IHC (including melanoma, NSCLC, renal cell carcinoma, colorectal carcinoma, and castration-resistant prostate cancer) are insensitive to ICBs, suggesting that PD-L1 alone is not a reliable predictor of clinical response [43]. Simultaneously analyzing multiple immunological biomarkers with RNAseq could improve this situation and allow for the identification of a gene signature that reliably identifies patients who will respond to immunotherapy with ICBs. In this study, we were not able to correlate RNAseq data with clinical outcome, owing to the type of specimens we employed (ovarian cancer patients do not receive ICBs as part of the clinical routine). Moreover, our study involved a limited number of patients affected by a single type of tumor, calling for validation experiments in larger and more heterogeneous patient cohorts. Irrespective of these caveats, we did identify a subset of immune transcripts that were co-expressed with PD-L1, and we are evaluating these potential biomarkers in cohorts of melanoma, NSCLC and renal carcinoma patients who receive ICBs as part of their treatment. The road to predicting clinical response to ICBs appears to be more complex than the assessment of a single biomarker like PD-L1. Further experimental and clinical validation of the Immune Advance assay is underway to obtain a robust method for simultaneously measuring the expression of multiple immune transcripts from single FFPE samples. We surmise this may pave the way to improve patient selection for immunotherapy with ICBs. www.impactjournals.com/oncotarget

Patients and specimens
All patients referred to in this report were diagnosed with ovarian cancer and treated at Roswell Park Cancer Institute (RPCI, Buffalo, NY, US). The RPCI institutional board gave explicit approval to the study, and all samples were obtained upon informed consent under an institutional protocol for tissue collection. To control for tumor heterogeneity in an effort to minimize biological variability, freshly procured remnant tissue was sectioned into two approximate halves, one of which was processed as FF in Optimal Cutting Temperature compound, and the other one fixed in formalin and processed as per standard clinical practices. Each half was marked with ink across the surface to maintain original tissue orientation and mounted on slides face side up (Figure 1). Sections from the mirroring surfaces of both FF and FFPE blocks were cut and stained with hematoxylin and eosin for quality review by a qualified pathologist. Additional serial sections were cut for RNA extraction and IHC. A total of 14 matched FFPE/FF pairs corresponding to 28 samples were collected.

RNA extraction
RNA was extracted from FFPE tissues using the truXTRAC™ FFPE RNA Kit (Covaris), as per manufacturer's instructions with modifications. Briefly, lysates from partially lysed tissue samples were processed immediately for RNA extraction. The truXTRAC™ FFPE RNA Kit is designed for use with the Adaptive Focused Acoustics AFA™ process. Standard de-crosslinking and column purification steps were performed to remove proteins and other cellular components prior to RNA elution in water. RNA was extracted from FF tissues using the AllPrep DNA/RNA Mini Kit (Qiagen), as per manufacturer's instructions. RNA was quantified by means of the Qubit RNA HS Assay Kit (Thermo Fisher Scientific).

RNAseq library preparation, quantification, pooling and sequencing
Oncomine™ Immune Response Research Assay libraries were prepared using the Ion AmpliSeq™ targeted sequencing technology (Thermo Fisher Scientific), as per manufacturer's instructions. The assay is a 395-gene panel focused on diverse immunological processes including TCRS, tumor infiltration by immune cells, and other key immune functions (Supplementary Table S3). Briefly, 10 ng RNA was reverse transcribed into cDNA (25 °C, 10 min; 42 °C, 60 min; 85 °C, 5 min; 4 °C, hold) and targets were amplified (99 °C, 2 min; 19 cycles of 99 °C, 15 s and 60 °C, 4 min; 10 °C, hold) with a multiplex immune response primer pool targeting 395 genes. Amplicons were partially digested using the FuPa Reagent (50 °C, 10 min; 55 °C, 10 min; 60 °C, 20 min; 10 °C, hold for up to one hour). Barcode adapters were ligated to partially digested amplicons (22 °C, 30 min; 72 °C, 10 min; 10 °C, hold for up to one hour) and purified. Libraries were quantified using the Ion Library Quantification Kit (Applied Biosystems by Life Technologies), as per manufacturer's instructions. Up to 20 libraries normalized to 50 pM were pooled in equimolar amounts prior to enrichment and template preparation using the Ion Chef™ system (Thermo Fisher Scientific). 200-bp sequencing was performed on the Ion Proton™ P1v3 chip (Thermo Fisher Scientific) to obtain 2-3M reads per sample. Absolute digital gene expression counts and nRPM values were generated using the Torrent Suite software (v5.0.2) and the immuneResponseRNA plugin (both from Thermo Fisher Scientific).

Gene expression normalization
A baseline expression profile for 10 endogenous control genes was established based on average RPM counts from the internal control sample NA12878 across eleven sequencing runs. Following determination of baseline expression levels, test samples were normalized based on the formula f(i) = x(i) / p(i), in which f(i) is the fold change of the raw read count x(i) of the i-th endogenous control over its value p(i) in the above-mentioned baseline profile. The median of the fold changes from all these controls was then determined as F = median(f(i) | i = 1, …, 10). This value was further used to normalize RPM counts for all genes in the sample according to the formula x'(i) = x(i) / F, where x(i) is the raw read count of the i-th gene and x'(i) is the normalized expression (nRPM) value to be used for downstream analysis. Finally, nRPM values were log2-transformed.
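The normalization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used by the immuneResponseRNA plugin: the gene names, the read counts and the three-control baseline are invented for the example (the assay uses 10 controls), and the pseudocount added before the log2 transformation is an assumption introduced here to avoid taking the logarithm of zero, as the text does not specify how zero counts are handled.

```python
from math import log2
from statistics import median


def normalization_factor(sample, baseline):
    """F = median of f(i) = x(i) / p(i) over the endogenous control genes."""
    fold_changes = [sample[gene] / baseline[gene] for gene in baseline]
    return median(fold_changes)


def nrpm(sample, baseline):
    """x'(i) = x(i) / F for every gene in the sample."""
    factor = normalization_factor(sample, baseline)
    return {gene: count / factor for gene, count in sample.items()}


def log2_nrpm(values, pseudocount=1.0):
    """Log2-transform nRPM values; the pseudocount is an assumption."""
    return {gene: log2(value + pseudocount) for gene, value in values.items()}


# Invented baseline profile p(i) for three hypothetical control genes.
baseline = {"CTRL_A": 100.0, "CTRL_B": 200.0, "CTRL_C": 50.0}
# Invented counts x(i) for one sample: the controls plus one target gene.
sample = {"CTRL_A": 200.0, "CTRL_B": 400.0, "CTRL_C": 150.0, "TARGET": 30.0}

normalized = nrpm(sample, baseline)
print(normalization_factor(sample, baseline))  # fold changes are 2, 2 and 3
print(normalized["TARGET"])
print(log2_nrpm(normalized)["TARGET"])
```

Using the median rather than the mean of the control fold changes keeps the scaling factor stable when a single control gene behaves atypically in a given sample.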

Immunohistochemistry
A 5 μm thick whole section from each FFPE sample was stained with antibodies specific for NY-ESO-1 (E978, Santa Cruz Biotechnology), PD-L1 (22C3 pharmDx, Dako), and CD8 (C8/144B, Dako), according to standard procedures. NY-ESO-1 and PD-L1 expression was evaluated by a board-certified pathologist who interpreted the staining as positive or negative. For NY-ESO-1, a positive sample was defined by moderate to strong cytoplasmic staining with membranous accentuation that is distinct from background in at least 5% of neoplastic cells, while a negative sample was defined by staining in <5% of neoplastic cells. For PD-L1, a positive sample was defined as per FDA-approved guidelines as partial or complete cell membrane staining (≥ 1+) in ≥ 50% of viable tumor cells, while a negative sample was defined by any membranous staining in less than 50% of neoplastic cells. CD8+ T lymphocytes were stained and scored using the Aperio Scanscope (Aperio Technologies, Inc.), based on 20X bright-field optical microscopy. Images were analyzed using Spectrum (Aperio Technologies, Inc.) and the number of CD8+ T lymphocytes per square millimeter was counted.
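The percentage cutoffs used for the positive/negative calls can be expressed as a small Python sketch. It models only the fraction of stained neoplastic cells (at least 5% for NY-ESO-1, at least 50% for PD-L1) and deliberately ignores the staining intensity and pattern criteria, which require a pathologist's judgment. The function and dictionary names are hypothetical.

```python
# Minimum percentage of stained neoplastic cells for a positive call,
# taken from the scoring definitions given in the text.
POSITIVE_CUTOFF_PERCENT = {
    "NY-ESO-1": 5.0,
    "PD-L1": 50.0,
}


def ihc_call(marker, percent_stained_cells):
    """Classify a sample as 'positive' or 'negative' for one marker."""
    if not 0.0 <= percent_stained_cells <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    cutoff = POSITIVE_CUTOFF_PERCENT[marker]
    return "positive" if percent_stained_cells >= cutoff else "negative"


# Sample #2: NY-ESO-1 staining in 5% of neoplastic cells.
print(ihc_call("NY-ESO-1", 5.0))
# Sample #10: membranous PD-L1 staining in 35% of neoplastic cells.
print(ihc_call("PD-L1", 35.0))
```

Under these cutoffs, a sample with membranous PD-L1 staining in 35% of neoplastic cells is called negative regardless of its transcript abundance, which is the discordance noted for sample #10.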

Statistical analysis
Correlation coefficients were calculated according to the Pearson method. p values < 0.05 were considered statistically significant. All statistical analyses were conducted with Prism 7 (GraphPad Software).
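For readers who wish to reproduce this type of analysis outside Prism, the Pearson coefficient and its associated t statistic can be computed as in the Python sketch below. This is a plain standard-library illustration, not the procedure run in Prism; the function names and the toy input vectors are assumptions made for the example. Converting the t statistic into a two-sided p value requires the Student t distribution with n - 2 degrees of freedom, which a statistics package such as SciPy (scipy.stats.pearsonr) provides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Toy vectors standing in for log2 nRPM values of one FFPE/FF pair.
ffpe = [1.0, 2.5, 3.0, 4.5, 6.0]
ff = [1.2, 2.4, 3.3, 4.4, 5.9]
r = pearson_r(ffpe, ff)
print(r, t_statistic(r, len(ffpe)))
```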