Analyses of selected safety endpoints in phase 1 and late-phase clinical trials of anti-PD-1 and PD-L1 inhibitors: prediction of immune-related toxicities

Purpose Anti-PD1 and PD-L1 antibodies are associated with immune-related adverse effects (irAEs). This analysis aims to assess the discrepancies between frequencies of irAEs observed in phase 1 trials with those seen in late-phase trials and to evolve the field of drug development. Methods PubMed search was conducted for articles published until December of 2016. Trials needed to have at least one of the study arms consisting of nivolumab, pembrolizumab or atezolizumab monotherapy. Trials were matched based on compound used and similarity of populations. All toxicities were reported as frequencies and percentages. P-values to assess differences between matches and non-matches of phase 1 and late-phase trials and between early and late-phase trials themselves were obtained via Fisher's exact test. Odds ratios were obtained via logistic regression. Results Our search yielded 15 late-phase and 10 matching phase 1 trials; n = 4823 and n = 1650, respectively. The most common AEs seen in phase 1 trials were also observed in late-phase trials except for phase 1 trials (median n = 118) with < 118 patients (P = 0.048). Rash, pruritus, and diarrhea were the most frequently irAEs reported. Only colitis was more frequent in late-phase studies (P = 0.045). Conclusion Toxicities of anti-PD-1 and PD-L1 observed in phase 1 trials and late-phase trials are similar. There is positive correlation between phase 1 trial sample size and concordance of toxicity frequencies seen in late-phase studies. In conclusion, current immunotherapy phase 1 trials are appropriate in assessing safety profile of anti-PD-1 and PD-L1 antibodies.


INTRODUCTION
In the recent years numerous clinical trials have demonstrated the efficacy of anti-PD-1 and PD-L1 antibodies leading to FDA approval of nivolumab for patients with advanced melanoma, renal cell carcinoma (RCC), and non-small cell carcinoma (NSCLC) alongside pembrolizumab for treatment of advanced melanoma and non-small cell carcinoma NSCLC, and atezolizumab for urothelial carcinoma (UC) [1][2][3][4][5][6]. The efficacy of these drugs relies on enhancing anti-tumor immunity through the inhibition of negative regulatory signaling in T cells. The inhibition of immune checkpoint receptors disrupts immune tolerance resulting in enhanced immune activation in normal tissues leading to potentially significant toxicities [7]. Indeed, these agents have distinct toxicity profile compared to other therapies (i.e., chemotherapy and targeted therapies) characterized by immune-related adverse events (irAE) that are rare (absolute risk of allgrade irAE of ~ 10%) but when present can result in lifethreatening events [8][9][10]. irAEs affect a wide range of organs including endocrine organs, thyroid, adrenal and pituitary glands, skin, gastrointestinal tract, lung, kidney, liver, pancreas, and the nervous system [11].
As an inheritance from chemotherapy drug development methodology, conventional primary endpoints of phase 1 trials are: definition of maximum tolerated dose (MTD), recommended phase 2 dose (RP2D), and estimation of safety profile of new drugs to help guide later phase clinical trials (i.e., phase 2 and 3 trials) [12]. The underlying premise was of an existing relationship between dose-efficacy and dose-toxicity. Dose intensity was associated with anti-tumor efficacy [13,14]. This paradigm is challenged by the observation of distinct late toxicities of newer targeted agents emerging from prolonged use of drugs under development as opposed to more traditional cytotoxic therapies associated with mostly acute toxicities [15]. Furthermore, checkpoint antibody (i.e., anti-PD1 and PD-L1 inhibitors) treatment has been designed to be administered over an extended period and traditional definition of MTD and dose limiting toxicity (DLT) may not allow for prediction of toxicities in later phase clinical trials. The goal of this analysis is to investigate the correlation between frequencies of adverse events observed in phase 1 trials of anti-PD-1 and PD-L1 monoclonal antibodies (i.e., pembrolizumab, nivolumab, and atezolizumab) with those seen in late-phase trials among patients with solid tumors with particular attention to irAEs.

Study inclusion and characteristics
Initially our strategy yielded 1057 publications through our PubMed search. After screening of the study titles and abstracts 972 studies were excluded. After text review, 70 more studies were excluded for not meeting the inclusion criteria ( Figure 1). Fifteen late-phase trials met the inclusion criteria and data were extracted. These studies comprised 10 phase 3, 4 phase 2 randomized studies, and one phase 2 nonrandomized trial. The latter was included into the analysis as it was the basis for approval of atezolizumab for the treatment  Table 1). A total of 10 matchingphase 1 trials were included into our analysis. The number of dose levels ranged from 1 to 8. Three trials enrolled patients with different solid tumor histologies. Five trials studied nivolumab, 3 pembrolizumab, and 2 atezolizumab (Supplementary Table 2).

Concordance between total number of AEs, grade 3/4, grade 5 AEs between phase 1 and latephase trials
Death was a rare event among patients treated with checkpoint inhibitors. A total of 5 patients out of 1650 (0.03%) patients treated in phase 1 trial died as consequence of treatment-related AEs (all 5 were due to pneumonitis) compared to 12 out of 4823 (0.24%) in late-phase trials (i.e., encephalitis, myocardial infarction, sepsis, neutropenia, hypercalcemia one each; two due to pneumonia, four due to pneumonitis, and one due to unknown non-immune mediated causes; the odds of treatment-related deaths were similar comparing different clinical trial groups (OR = 1.2, 95% CI 0.6-2.4; P = 0.59). Grade 3 and 4 AEs were documented in 12% and 14% of the patients treated in phase 1 and late-phase studies, respectively (OR = 1.05, 95% CI 1.0-1.1; P = 0.052). Lastly, 69% of patients treated in the phase 1 trials group experienced an AE compared with 71% for the patients treated in the late-phase clinical trials (OR = 1.01, 95% CI 1.0-1.1; P = 0.04). These results suggest that phase 1 trials can reliably predict overall toxicities in late-phase studies.

Concordance between irAEs in phase 1 and latephase trials
The most commonly reported treatment-related irAEs reported in phase 1 trials were rash, pruritus, diarrhea, pneumonitis, and thyroid dysfunction (Table 3). Rash, pruritus, and diarrhea were the most $ Defined as at least 50% (3/4) of the 4 most common AEs in phase 1 seen among most common AEs in later phase trials.
Data was reported as frequencies and percentages at the study level. P-values were obtained via Fisher's exact test. Odds ratios were obtained via logistic regression.
common irAEs documented in both phase 1 and latephase trials. Nine other immune-rated AEs occurred in similar frequencies in phase 1 and late-phase trials. At the trial level analysis, colitis was observed more frequently in late-phase trials compared to phase 1 trials (66.7% vs. 10%; OR=18; 95% CI 1.8-185; P = 0.01). Similarly at the patient-level analysis, all-grade colitis was reported at low frequencies in both phase 1 and late-phase studies but tended to be more frequent among the latter studies (0.12% vs. 0.85%; OR = 3.0, 95% CI 1.02-9.0; P = 0.045). There was higher frequency of hypophysitis, and adrenal insufficiency in late-phase trials but these differences did not reach statistical significance (i.e., 0.18% vs. 0.24% 0% vs. 0.12% in phase 1 and late-phase trials, respectively). All-grade pneumonitis and hypothyroidism were reported at high frequencies in both phase 1 and late-phase trials (70% vs. 86.7% and 70% vs. 73.3%, respectively) ( Table 3).
In summary, frequencies of irAEs were seen at similar rates in both phase 1 and late-phase studies expect for colitis.

Concordance between most common AEs seen in phase 1 trials with AEs seen in late-phase trials
In the 10 phase 1 trials, fatigue, rash, pruritus and diarrhea were the most commonly reported allgrade AEs. In six out of ten matched-phase 1 trials the most common all-grade AEs were concordant with the most common AEs documented in the late-phase trials. Stratified analyses based on the following phase 1 clinical trial characteristics failed to show significant correlations between trial characteristic and the odds of concordance between frequency of most common AEs:

Concordance between most common AEs seen in late-phase trials with AEs seen in phase 1 trials
In the 15 late-phase studies, fatigue, nausea, decreased appetite, and pruritus were the most commonly reported all-grade AEs. In 53% (8/15) late-phase trials the most common all-grade AEs were also observed at high frequencies in the matched phase-1 trials. By contrast in 47% (7/15) late-phase trials there was no concordance between the most common AEs and matched phase 1 trials. None of the late-phase trial characteristics showed significant correlation with the frequency of most common For patient-level analysis, odds ratios were obtained via logistic regression. P-values and odds ratios were obtained via logistic regression.

DISCUSSION
The value of phase 1 trials in predicting adverse events and accurately defining drug-related toxicity profile in the field of targeted therapies and chemotherapy drug development has been well documented [16]. In the recent years, anti-PD-1 and PD-L1 directed monoclonal antibodies gained unprecedented momentum in the field of cancer immunotherapy as illustrated by having three of those agents now approved for treatment of solid tumors (i.e., nivolumab, pembrolizumab, and atezolizumab). We conducted a systematic review to refine the understanding as to how well can phase 1 trials predict toxicities of immune checkpoint inhibitors in late-phase trials (phase 2 and 3). Fifteen late-phase studies (i.e., 14 randomized trials and 1 non-randomized phase 2 trial) met the inclusion criteria of this analysis, including 8 trials assessing the efficacy of nivolumab, 4 of pembrolizumab, and 3 of atezolizumab. These trials were matched with 10 phase 1 trials that supported advancement of these agents into corresponding late-phase trials. Matched phase 1 trials had 30 to 495 study subjects with median distribution of 118 participants. The elevated sample size observed among the 10 phase 1 trials analyzed represents presumed anticipated early efficacy signal of anti-PD1 and PDL-1 agents for the treatment of solid tumors leading to rapid transition to large expansion cohorts in these studies. Most matched phase 1 and late-phase trials showed similar frequencies of all-grade AEs and the most frequently AE reported in both phase 1 and late-phase studies besides fatigue affected the gastrointestinal tract and the skin (Supplementary  Tables 1 and 2). This was not surprising as most common all-grade AEs related to checkpoint inhibitors have a peak frequency within less than 3 months of treatment initiation (i.e., diarrhea, rash, pruritus, etc.) [7]. By contrast, only in one out five studies with sample size below the median (n = 118) were the most common all-grade AEs observed among the most common toxicities in matched-phase 3 trials (P = 0.048), indicating the limitation of small phase 1 trials to detect AEs ( Table 1). The large sample size of expansion cohorts observed in the phase 1 trials should also be noted. As expected, there was concordance between the frequencies of most common AEs observed among the 15 late-phase studies and the phase 1 studies in the majority of matched trials (Table 2). In our study, a total of 4823 and 1650 patients were evaluable for toxicity in all 15 late-phase and phase 1 studies, respectively. Indeed, allgrade rash, pruritus, diarrhea, pneumonitis and thyroid disturbances were the most frequent irAEs in both in early and late-phase studies (Table 3). Both at the trial and patient level analyses all-grade colitis was reported more frequently in late phase trials (P = 0.01 and P = 0.045, respectively) ( Table 3). In general treatment with anti-PD-1 and PDL-1 inhibitors was well tolerated among patients with metastatic solid tumors. A total of 5 and 12 treatmentrelated deaths occurred in the phase 1 and late-phase studies, respectively (OR = 1.2, P = 0.59). Most of these deaths were secondary to lung toxicities (11 out of 17). Grade 3 and 4 AEs happened in 12% and 14% of the phase 1 and late-phase trials, respectively (OR = 1.05, P = 0.052), indicating similar and low frequencies of these events in early and late-phase trials. However notwithstanding similar frequencies of serious AEs leading to death the spectrum of grade 5 toxicities among patients treated with late-phase studies were much wider compared to phase 1 trials in which all 5 deaths were due to pneumonitis. Physicians and drug developers should be aware of and surveil for these potentially lethal toxicities. Nonetheless serious AEs leading to death were rare in both phase 1 and late-phase trials. Despite the overall large number of patients observed in most phase 1 studies, our findings are limited by the small number of immune checkpoint clinical trials published, which increases the chance of type 1 and 2 errors. In addition as this is a manuscript-based study we were limited in conducting further explorations of associations between study population characteristics and the predictability of toxicities in phase 1 trials, which would possible if we had access data at the patient level from each trial included. Furthermore, phase 1 trials herein reviewed were not matched with late-phase studies of equivalent dose and schedule of administration. The latter observation comes as a consequence of the fact that treatment regimens of anti-PD-1 and PD-L1 antibodies assessed in late-phase trials have been proposed based on pharmacokinetic and pharmacodynamic studies as opposed to conventional phase 1 endpoints (i.e., MTD, DLT, and recommended phase 2 dose level and schedule). Our results are important in that they validate the results of safety endpoints in phase 1 anti-PD-1 and PD-L1 treatments as a tool to predict toxicities in late-phase trials likely as a function of lager sample size of phase 1 clinical trials. This is important, as recent phase 1 checkpoint inhibitor trials have incorporated efficacy end points, which showed early treatment efficacy and led direct drug study in phase 3 trials [17,18]. As newer therapies with distinct mechanisms of action continue to emerge the field of cancer drug development will need to continuously assess the appropriateness of phase 1 trials design in predicting toxicities in late-phase trials.

Search strategy
Package insert for currently approved anti-PD1 and PD-L1 antibodies were searched for references on late-phase approval trials (i.e., phase 2 and 3 trials) and phase 1 trials. For the PubMed search, the following keywords or corresponding Medical Subject Heading terms were used: "nivolumab" "pembrolizumab", "atezolizumab". The key word "ipilimumab" was also used to increase the sensitivity of our search, as phase 2 and 3 trials assessing the efficacy of ipilimumab combined or compared with anti-PD1/PD-L1 inhibitors exist. No filters or language limit was used to maximize search sensitivity. The database was searched for articles published until December 27 th 2016.

Selection and matching of trails and data extraction
Late-phase trials were required to meet the following inclusion criteria: (i) at least one of the study arms consisting of nivolumab, pembrolizumab or atezolizumab monotherapy; (ii) solid tumor non-pediatric population, (iii) phase 2 trials were included only if they were used to support drug approval (Drug-approving phase 2 trials were allowed as our analysis aimed to assess the safety of current drug approval strategy of checkpoint inhibitors), (iv) late-phase trials not containing at least one of the antibodies under study as monotherapy were excluded. Phase 1 trials were required to meet the following inclusion criteria: (i) dose-finding study assessing toxicity of nivolumab, pembrolizumab or atezolizumab as monotherapy; (ii) solid tumor non-pediatric population. Finally matching between late-phase and phase 1 trials was performed according the following criteria: (i) compound with more than one matching phase 1 trial; larger matching phase 1 trial will be used for data analysis, (ii) the phase 1 clinical trial had similar patient population compared to the late-phase trial, (iii) when a late-phase trial could not be matched with a phase 1 single histology trial, the phase 1 multiple histology with larger number of patients with a given tumor histology was matched with the late-phase trial with corresponding tumor histology. For late-phase trials meeting inclusion criteria tumor histological type (melanoma vs. non-melanoma), and dose regimen were also documented. For phase 1 trials tumor histological type, number of dose levels, presence of doselimiting toxicities were documented.
For all clinical trials, the first author's name, date of publication, study phase, and type of PD-1 or PD-L1 inhibitor were documented. The number of patients evaluable for toxicity and the number of adverse events (AEs) of phase 3 study arms containing the same PD-1 or PD-L1 inhibitor at different doses within the same clinical trial were summed for analysis. According to the National Cancer Institute Common Terminology Criteria Adverse Events (NCI CTCAE) version 4.0, the number of all grade, grade 3 and 4, grade 5, and potentially irAEs were extracted. The number of all grade potentially irAEs were collected according to study arm [rash, pruritus, vitiligo, diarrhea, colitis, hypopituitarism/hypophysitis, adrenal insufficiency, hypothyroidism, hyperthyroidism, hyperglycemia, non-infectious pneumonitis]. In light of the expected low number of irAEs, all grades of selected AEs were extracted. The four most common AEs observed in each phase 1 and late-phase study was also documented.

Concordance of most common AEs in phase 1 and late-phase trials
The 4 most frequently observed all-grade AEs in both phase 1 and late-phase trials were documented. Matched late-phase and phase 1 trials were considered to have concordance frequencies of most common AEs if at least 50% (3 out of 4) of the 4 most common AEs in phase 1 seen among most common AEs in later phase trials.

Statistical methods
Data were reported as frequencies and percentages at the study level. P-values to assess differences between matches and non-matches of phase 1 and late-phase trials and between early and late-phase trials themselves were obtained via Fisher's exact test. Odds ratios were obtained via logistic regression. P-values and odds ratios were obtained via logistic regression for the subject level analyses between early and late phase trials. Analyses were conducted in SAS v9.4.

Author contributions
R. Costa (study design, data collection, data interpretation, and manuscript writing), R. B. Costa (data interpretation and manuscript writing), S. M. Talamantes (data interpretation and manuscript writing), I Helenoswki (data analysis and manuscript writing), B. A. Carneiro (data interpretation and manuscript writing), Y. K. Chae (data interpretation and manuscript writing), W. J. Gradishar (data interpretation and manuscript writing), R. Kurzrock (study design, data interpretation, and manuscript writing), F. J. Giles (study design, data interpretation, and manuscript writing).

CONFLICTS OF INTEREST
R. Kurzrock has research funding from Genentech, Merck Serono, Pfizer, Sequenom, Foundation Medicine, and Guardant, as well as consultant/advisory board fees from Actuate Therapeutics and XBiotech, and an ownership interest in Curematch, Inc.
R. Costa receives research funds from Bristol-Myers Squibb.
All other authors have not significant disclosures.

FUNDING
The Woman's Board of Northwestern Memorial Hospital.