by Douglas DiRuggiero, DMSc, PA-C; Cynthia Trickett, PA-C, MPAS; Lauren Hippeli, MS; Sang Hee Park, MPH; Amy Baum-Jones, MPAS, PA-C; and David S. Davidson, MS, PA-C
Dr. DiRuggiero is with the Skin Cancer and Cosmetic Dermatology Center in Rome, Georgia. Ms. Trickett is with North Dallas Dermatology Associates in Dallas, Texas. Ms. Hippeli, Ms. Park, and Ms. Baum-Jones are with Bristol Myers Squibb in Princeton, New Jersey. Mr. Davidson is with Fallon Medica in Tinton Falls, New Jersey, and was an employee of Bristol Myers Squibb at the time of manuscript development.
Funding: This review was sponsored by Bristol Myers Squibb. Writing and editorial assistance was provided by Jieming Fang, MD, and Ann Marie Fitzmaurice, PhD, of Peloton Advantage, LLC, an OPEN Health company, funded by Bristol Myers Squibb.
Disclosures: DD has served as a speaker and consultant for AbbVie, Amgen, Arcutis, Bristol Myers Squibb, Eli Lilly, EPI Health, Incyte, Janssen, Leo Pharma, Novartis, Pfizer, Sanofi Regeneron, and UCB. CT has served as a speaker and consultant for AbbVie, Amgen, Bristol Myers Squibb, Eli Lilly, Incyte, Janssen, Journey Medical Corporation, Leo Pharma, Ortho Pharmaceutical, Pfizer, Regeneron, Sanofi Genzyme, Sun Pharmaceutical, and UCB. LH, SHP, and AB-J are employees of and shareholders in Bristol Myers Squibb. DSD was an employee of and shareholder in Bristol Myers Squibb at the time this review was first developed.
J Clin Aesthet Dermatol. 2024;17(7–8 Suppl 1):S15–S24.
Abstract
Numerous clinical trials have established that various biologic and oral small-molecule therapies are efficacious in patients with psoriasis. However, as there are limited head-to-head trials, healthcare providers may compare results across multiple trials when providing treatment recommendations. Direct comparisons among agents are challenging because psoriasis trials differ in terms of study design, patient population, and data analysis methodologies. Long-term clinical trials present additional challenges because the number of patients enrolled generally declines over time. The missing patient data that might occur, coupled with the specific approach used to substitute or impute those missing data, might introduce bias and skew efficacy results. In this review, we discuss how variations in study design and analytical methodologies affect efficacy outcomes in clinical trials. We also review published trials of biologic and oral small-molecule therapies for psoriasis to illustrate how issues related to missing data and choices in data imputation methodologies can affect the interpretation of efficacy outcomes. Imputation methodologies discussed include nonresponder imputation, modified nonresponder imputation, treatment failure rules, last observation carried forward, modified baseline observation carried forward, and multiple imputation. This review provides a foundation for the healthcare provider’s critical evaluation of the psoriasis literature and emphasizes the importance of considering the level of evidence provided in a clinical trial when making treatment decisions.
Keywords: Data imputation, psoriasis, statistical analysis, study design
Introduction
Plaque psoriasis is a common, chronic, inflammatory skin disease that can impair physical function, career choice, and quality of life.1 Biologic and oral small-molecule agents are efficacious therapeutic options for many patients with moderate-to-severe disease.2 Selecting an appropriate therapy is complex, as healthcare providers need to evaluate the quality of individual clinical trials and compare results across multiple clinical trials, which is difficult owing to the diverse study designs and statistical methodologies used.3 In addition, the limited availability of direct head-to-head comparisons and of consistent analyses between treatment arms in different trials makes recommendations challenging.4 Furthermore, with long-term clinical trials, the number of patients enrolled generally decreases over time, and analyses used to substitute or impute missing data might introduce bias and skew clinical trial results.3 In this review, we discuss how variations in study designs and analytical methodologies can affect reported clinical efficacy outcomes. We also review published trials of biologic and oral small-molecule therapies for psoriasis to illustrate how issues related to study design, missing data, and choices in data-handling strategies might affect the interpretation of clinical outcomes. These discussions might be relevant to healthcare providers who must evaluate an increasing volume of clinical trial results in the context of varying study designs and analytical methodologies when seeking to provide evidence-based treatment recommendations to their patients.
Study Design-related Considerations
Randomized, controlled trials (RCTs) are considered the gold standard for evaluating the efficacy and safety of new therapies.3 However, these trials can differ in terms of study design-related parameters, such as type of trial, eligibility criteria, sample size, study power, efficacy analysis populations, and other statistical methodologies.3 Understanding these parameters and how they can affect efficacy outcomes is necessary to make informed treatment decisions. Additional study design-related parameters that affect the interpretation of clinical trial results are reviewed in detail elsewhere.3
Clinical trial design. RCTs can be designed to demonstrate superiority, noninferiority, or equivalence (Table 1).5 The objective of a superiority trial is to demonstrate that a new therapy is significantly better than a comparator (standard therapy or placebo). The objective of a noninferiority trial is to demonstrate that a new therapy is noninferior to a comparator (standard therapy), that is, not worse by more than a specified margin of clinical significance. The objective of an equivalence trial is to demonstrate that a new therapy is equivalent to a comparator (standard therapy) within a specified margin of clinical significance.5 Active comparator trials are common in dermatology, with numerous superiority, noninferiority, and equivalence trials conducted in patients with psoriasis.4 However, as discussed below (see “Statistical significance and clinical relevance”), considerable variability exists in the conduct and reporting of these trials, including the maximum acceptable treatment difference used in each trial.4
Each trial type is designed to pose null and alternative hypotheses. For instance, as shown in Table 1, the null hypothesis in a superiority trial is that there is no difference between the new treatment and the comparator, and the alternative hypothesis is that a difference exists; the null hypothesis is rejected in favor of the alternative when a significant difference is observed and is retained (but not proven) otherwise. Type I and II errors arise when the null hypothesis is falsely rejected or falsely retained, respectively.5 Type I errors (false positives) are defined as erroneously concluding superiority, noninferiority, or equivalence of the new therapy compared with the comparator when none exists; Type II errors (false negatives) are defined as not concluding superiority, noninferiority, or equivalence when they exist.5 Interpretation of the design and results of superiority, noninferiority, and equivalence trials might be confusing because these trials have different objectives; the objective of a superiority trial is to show that the new therapy is better than the comparator, whereas the objectives of noninferiority and equivalence trials are to show that the new therapy is not worse than or is equivalent to the comparator, respectively.5
Clinical trials often use treatment crossover to evaluate the efficacy of new psoriasis treatments. Patients enrolled in a clinical trial with treatment crossover receive a sequence of treatments rather than a single treatment, and the effect of this sequence is evaluated in the same patient. For example, in the Phase III POETYK PSO-1 (NCT03624127) and POETYK PSO-2 (NCT03611751) clinical trials, patients with moderate-to-severe plaque psoriasis were randomized to receive deucravacitinib, placebo, or apremilast, allowing direct comparison between two therapies.6,7 Patients randomized to placebo crossed over to deucravacitinib at Week 16 (time of co-primary endpoint assessment) to determine whether clinical response could be achieved in patients who initiated active treatment after receiving placebo for 16 weeks.6,7 In POETYK PSO-2, patients treated with deucravacitinib or apremilast who achieved 75-percent or greater reduction from baseline in the Psoriasis Area and Severity Index (PASI75) at Week 24 were rerandomized to deucravacitinib or placebo to evaluate maintenance and durability of response. The POETYK PSO-2 trial was designed to allow patients who experienced a relapse of disease after randomized withdrawal from active treatment to restart treatment to assess recapture of response.7 Similarly, the Phase III ESTEEM 1 (NCT01194219) and ESTEEM 2 (NCT01232283) trials randomized patients with moderate-to-severe plaque psoriasis to apremilast or placebo.8,9 Patients receiving placebo crossed over to apremilast at Week 16; patients initially randomized to apremilast who achieved PASI75 (ESTEEM 1) or PASI50 (ESTEEM 2) at Week 32 were rerandomized to receive apremilast or placebo until Week 52.8,9 These trials illustrate how a design that incorporates treatment crossover can be used to evaluate whether psoriasis therapies are efficacious both in terms of maintaining clinical efficacy and recapturing response after treatment withdrawal.
Eligibility criteria. Clinical trials aim to enroll a patient population representative of the target population.3 If the clinical trial population is genuinely representative of the target population, it can be inferred that the findings from a clinical trial are generalizable to the target population, referred to as external validity.3,10 However, selection bias can occur when patients enrolled in a clinical trial are not representative of the target population.3
Inclusion and exclusion criteria are used to define the appropriate patient population for any given clinical trial. Subtle differences in eligibility criteria might skew the composition of the patient population and bias efficacy outcomes. Therefore, it is important to understand differences in inclusion and exclusion criteria when comparing results across clinical trials. For example, disease severity in patients enrolled in clinical trials might be greater than that in the target population because the effect size needed to demonstrate a statistical difference between treatment groups requires patients with sufficient disease severity. As such, clinical trials in moderate-to-severe plaque psoriasis typically enroll patients with PASI of 12 or greater, body surface area (BSA) involvement of 10 percent or greater, and static Physician Global Assessment score (sPGA) of 3 or greater,6–9 whereas treatment guidelines define the moderate-to-severe plaque psoriasis population as BSA involvement of three percent or greater or the presence of severe disease in specific areas.11 Additionally, efficacy outcomes might be affected if patients are not permitted to enter a trial unless they have previously failed to respond to specific treatments. This can result in trial enrollment of a patient population with comparatively more severe and refractory disease compared with a trial that did not impose such limitations. In this case, differences in efficacy outcomes might be more attributable to patient selection than to treatment benefit.
Many of the more recently completed Phase III trials of biologic and oral small-molecule therapies have enrolled patients with psoriasis who were candidates for systemic therapy and/or phototherapy, including those with a history of prior therapy, such as biologic therapy, after specified washout periods.7,8,12–14 Phase III trials of the oral therapies apremilast and deucravacitinib and the biologics risankizumab, tildrakizumab, and guselkumab have typically only excluded patients who had previously received the studied agent, closely related agents targeting the same pathway, and trial comparators.7,8,12–14 This represents an effort by the dermatology community to make clinical trial results more applicable to patients with psoriasis in the general population who have received prior therapy.
Clinical trials in psoriasis should accurately reflect the patient population affected by this disease. As a result, clinical trials generally include patients with psoriasis from diverse groups because patients might react differently to new treatments depending on race, ethnicity, and other factors.15 The United States (US) Food and Drug Administration (FDA) has recently developed a guidance for industry aimed at supporting plans to increase participation of patients from underrepresented racial and ethnic populations in clinical trials in the US.15
Selection of eligibility criteria might also affect the interpretation of safety outcomes in clinical trials. For example, claims that a new therapy is not associated with cardiovascular events should be interpreted cautiously in situations where patients who have a history of or received treatment for cardiovascular disease were excluded from clinical trial participation or where patient participation was limited to those with no more than one or two cardiovascular disease entities (e.g., hypertension, diabetes mellitus, arrhythmia, obesity, and smoking).
Efficacy analysis population. The efficacy analysis population, defined as the patient population used to determine efficacy outcomes, is an important factor that might differ among clinical trials. Various efficacy analysis populations are used in clinical trials, with the full analysis set and the per-protocol analysis set being among the most common. The full analysis set is commonly defined as all randomized patients, who are typically analyzed according to the treatment group assigned at randomization (intention-to-treat [ITT] principle), even if some patients did not enter or complete the trial. Use of the full analysis set prevents the selective analysis of a subset of patients, which might affect the baseline equivalence established during the initial randomization process and create systematic bias, resulting in overestimation or underestimation of efficacy response rates.16 The per-protocol analysis set, a subset of the full analysis set, is generally defined as patients who are compliant with study treatment and who do not have any relevant protocol deviations that may affect primary efficacy outcomes. Similar to the full analysis set, the per-protocol analysis set is analyzed according to the treatment group assigned at randomization.
Generally, the full analysis set is the primary efficacy analysis population in superiority trials because it is more conservative and reduces the bias arising from treatment nonadherence and loss to follow-up; the per-protocol analysis set serves as a supportive efficacy analysis population.4,6,7,17 Both full and per-protocol analyses are used in noninferiority and equivalence trials; use of the full analysis only might increase Type I error when inferiority is correct, especially in situations where treatment nonadherence is common.4,5,17 Despite this recommendation, a systematic review of active comparator-controlled trials of systemic and biologic therapies in psoriasis reported that only approximately one-third of noninferiority and equivalence trials performed both full and per-protocol analyses.4 Readers should be careful when interpreting noninferiority and equivalence trial results not based on both types of analyses.
During the randomized phase of a trial, the efficacy analysis set is often well defined and approved by regulatory authorities. However, a trial that enrolls a responder-enriched analysis set, comprising patients who have previously demonstrated a strong response to a specific therapy, might be expected to yield artificially elevated efficacy response rates compared to trials that enroll a patient population not stratified based on prior response. Responder analyses in psoriasis trials are frequently based on the achievement of PASI75; PASI75 responders may be rerandomized at a specific time point to continue active treatment to evaluate maintenance of response or to withdraw to placebo followed by crossover back to active treatment to evaluate loss or durability of response and subsequent recapture of response.8,9 Readers should be cautious when interpreting results from responder analyses, as these are more exploratory and based on subgroups that are defined post-randomization.
Sample size and statistical power. The number of patients included in a clinical trial (sample size; also known as “N”) and the statistical power (probability of detecting a difference between treatment groups when one exists) are important considerations in both clinical trial design and interpretation of results. Sample size and statistical power calculations can be quite complex and are generally based on the primary efficacy endpoint and other key elements (e.g., key secondary endpoints and safety database requirements).3 Sample size calculations consider various factors, including study design, whether the intent is to establish that populations do or do not differ, level of statistical significance, whether an observed difference must be in one direction or could be in both directions, study power, effect size, and extent of within-arm variability.3 Whether a trial is designed to demonstrate superiority or noninferiority is an important consideration, with noninferiority trials typically being larger than superiority trials.5 In practical terms, sample size is frequently determined based on preliminary studies of the active treatment, placebo, and/or comparator that provide an estimate of the expected effect size.6,8,9,18 For example, if early-phase clinical trials demonstrate that a new psoriasis therapy is efficacious in 50 percent of patients and placebo in 10 percent of patients, a sample size of 18 patients per group would provide 80-percent power to detect a difference between groups (80% chance of showing a statistically significant difference) with p less than 0.05 (difference that would occur by chance alone <5% of the time).
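The arithmetic behind such an estimate can be sketched in Python. This is an illustrative sketch only, assuming a two-sided test of two response rates and two common normal-approximation formulas; the function name n_per_group and its parameters are hypothetical and are not taken from any trial protocol. The exact figure depends on the formula, any continuity correction, and the software used, so the sketch is not expected to reproduce the number quoted above exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_new, p_ctrl, alpha=0.05, power=0.80, pooled=False):
    """Approximate patients per arm needed to compare two response rates (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    delta = abs(p_new - p_ctrl)                    # expected effect size
    var_alt = p_new * (1 - p_new) + p_ctrl * (1 - p_ctrl)
    if pooled:
        # Variant that uses the pooled response rate for the variance under the null hypothesis
        p_bar = (p_new + p_ctrl) / 2
        numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar)) + z_beta * sqrt(var_alt)) ** 2
    else:
        numerator = (z_alpha + z_beta) ** 2 * var_alt
    return ceil(numerator / delta ** 2)


n_simple = n_per_group(0.50, 0.10)               # unpooled-variance approximation
n_pooled = n_per_group(0.50, 0.10, pooled=True)  # pooled-variance approximation
print(n_simple, n_pooled)
```

Under these assumptions, the two approximations give 17 and 20 patients per group, which brackets the figure in the example and shows why the calculation method should be reported alongside the sample size.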
Clinical trials must have an adequate sample size and statistical power to detect clinically relevant differences between treatment groups. As the sample size and statistical power increase, the ability to detect progressively smaller differences between treatment groups increases.19 If a clinical trial has adequate statistical power, it can be asserted with reasonable confidence that any observed differences between treatment groups are genuine and can be generalized to the patient population.3 Failure to detect a difference can result from inadequate sample size and statistical power.19 Conversely, a clinical trial with a very large sample size and too much statistical power (overpowered) might detect differences that are not clinically relevant.19 Ideally, a clinical trial should have adequate power such that observed differences under the alternative hypothesis are both statistically significant and clinically relevant. Finally, sample size and statistical power calculations should be reported in clinical trial publications. Readers should be cautious when interpreting clinical trial results that do not mention sample size and statistical power analyses.
Statistical significance (p-values) and clinical relevance. The concepts of statistical significance and clinical relevance are at the heart of clinical research. Statistical significance occurs when observed differences between treatment groups are shown to be unlikely to have occurred by chance. Clinical relevance is determined primarily based on a target treatment effect size and a defined minimum clinically important difference (MCID), as well as personal clinical experience, consultation with medical experts, and review of the applicable literature. The target treatment effect size denotes the magnitude and direction of the difference between treatment groups.20 The MCID is defined as the clinical threshold that the treatment difference must meet in order to be clinically relevant. Distinguishing between statistical significance and clinical relevance is important because a clinical trial might report differences between treatment groups that are statistically significant but not clinically relevant.
The Consolidated Standards of Reporting Trials (CONSORT) for RCTs recommend using a p-value and confidence interval (CI) approach for declaring the superiority of one treatment over another.21 A p-value is the probability of observing an effect size as large as or larger than the result actually observed when the interventions did not truly differ.21 By convention, a p-value less than 0.05, known as the significance level, is often used as the cutoff in determining statistical significance.3 The null hypothesis is rejected when p is less than 0.05 (statistically significant difference), which means that if the null hypothesis were true, the probability of its being rejected is limited to five percent.3 In contrast, if p is greater than 0.05, there is insufficient evidence to reject the null hypothesis (but the null hypothesis is not proven).3
Although p less than 0.05 is often used as the cutoff in determining statistical significance, other factors must be considered in certain circumstances. For example, if multiple pairwise comparisons of different treatments are performed to evaluate their relative efficacy, results should be adjusted for multiple testing. This suggestion is based on the premise that if the null hypothesis of every statistical test is true, indicating that there are no differences between treatment groups, the probability of making at least one Type I error increases with the number of tests performed.3 For example, in the worst-case scenario, pairwise independent comparisons (used to compare treatment groups two at a time based on specific endpoints, with each comparison unaffected by the outcomes of other comparisons) among three treatment groups with a significance level of 0.05 result in nearly a 15-percent chance (1 − [0.95 × 0.95 × 0.95] = 0.143) of incurring at least one Type I error.3 Correction should therefore be made for multiple testing, and a more stringent cutoff for statistical significance might be necessary. There is no universally standard approach to correct for multiple testing, and an appropriate method should be selected on an individual basis.3
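The calculation above can be reproduced with a short Python sketch. The Bonferroni correction shown here is only one possible adjustment and is included as an assumption for illustration; it is not presented as the method used in any particular trial, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across independent tests when every null is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_cutoff(alpha, n_tests):
    """Per-comparison significance cutoff under the Bonferroni correction."""
    return alpha / n_tests


fwer = familywise_error_rate(0.05, 3)                 # three pairwise comparisons at 0.05
cutoff = bonferroni_cutoff(0.05, 3)                   # stricter cutoff for each comparison
fwer_corrected = familywise_error_rate(cutoff, 3)     # overall error rate after correction
print(round(fwer, 3), round(cutoff, 4), round(fwer_corrected, 3))
```

With three comparisons, the uncorrected family-wise error rate is 0.143, whereas testing each comparison at approximately 0.0167 keeps the overall rate just below 0.05.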
Many clinical trials report results using only the p-value approach; however, p-values have limitations in that they do not account for the magnitude or direction of the observed difference between treatment groups, and they can be misinterpreted.22 CIs can be more informative, especially when results are not significant. The usefulness of CIs does not depend on the p-value. A CI is a range of values, constructed around the point estimate of a parameter, that is expected to contain the true value of that parameter. The confidence level expresses how often CIs constructed in this way contain the true value: a confidence level defined in advance at 95 percent means that the CI includes the true value in 95 percent of analyses performed.22 A higher confidence level means that the CI will be broader. The width of the CI also depends on the sample size, with larger sample sizes resulting in narrower CIs and smaller sample sizes resulting in broader CIs.22
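The dependence of CI width on sample size and confidence level can be illustrated with a brief Python sketch. It assumes the simple normal-approximation (Wald) interval for a response rate and uses invented patient counts with a 50-percent response rate; other interval methods would give somewhat different limits.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(responders, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a response rate."""
    p = responders / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Same observed response rate (50%), increasing sample size
for n in (50, 200, 800):
    low, high = wald_ci(responders=n // 2, n=n)
    print(f"n={n}: 95% CI {low:.3f} to {high:.3f} (width {high - low:.3f})")

# Same data, higher confidence level
low99, high99 = wald_ci(100, 200, level=0.99)
print(f"n=200: 99% CI {low99:.3f} to {high99:.3f}")
```

The interval narrows as the hypothetical sample grows and widens when the confidence level is raised from 95 to 99 percent.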
Determining whether a new therapy is superior, noninferior, or equivalent to a comparator is possible using information provided by 95-percent CIs (assuming Type I error=0.05). Superiority is achieved when the difference between the means of efficacy endpoints for the new therapy and the comparator is greater than zero and the two-sided 95-percent CI of the difference does not cross zero (ratio >1 and the CI does not cross 1 for odds and risk ratios; Figure 1).4 Noninferiority is achieved when the lower limit of the 95-percent CI of the difference between means is greater than the lower noninferiority margin.4,17 Equivalence is achieved when the 95-percent CI of the difference between means is within the upper and lower equivalence margins.4
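These decision rules can be expressed as a small Python sketch. It assumes a Wald CI for the difference in response rates (new therapy minus comparator) and a symmetric margin; the patient counts and the 10-percentage-point margin are hypothetical and chosen only to show how the same CI can support one conclusion but not another.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(resp_new, n_new, resp_ctrl, n_ctrl, level=0.95):
    """Wald CI for the difference in response rates (new therapy minus comparator)."""
    p1, p2 = resp_new / n_new, resp_ctrl / n_ctrl
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sqrt(p1 * (1 - p1) / n_new + p2 * (1 - p2) / n_ctrl)
    diff = p1 - p2
    return diff - z * se, diff + z * se


def interpret(ci, margin):
    """Apply the CI rules; margin is a positive proportion (e.g., 0.10)."""
    low, high = ci
    return {
        "superior": low > 0,                            # whole CI lies above zero
        "noninferior": low > -margin,                   # lower limit above the lower margin
        "equivalent": low > -margin and high < margin,  # CI lies within both margins
    }


# Hypothetical trial: 140/200 responders on the new therapy, 130/200 on the comparator
ci = diff_ci(140, 200, 130, 200)
result = interpret(ci, margin=0.10)
print(ci, result)
```

In this invented example, the CI runs from roughly −4 to +14 percentage points, so noninferiority is met with a 10-point margin, but neither superiority nor equivalence can be concluded.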
Determination of the noninferiority margin is an important aspect in clinical trial design; this is generally achieved by establishing that the comparator produces a defined treatment effect at a defined sample size based on an analysis of the original placebo-controlled efficacy trials.17 Given that there are currently no agreed upon standards for establishing the noninferiority margin, a wide range of margins in trials addressing similar hypotheses are possible. For example, a systematic review of 49 active comparator-controlled trials of systemic and biologic therapies for psoriasis reported a wide range of acceptable treatment differences for each trial type (superiority: 14–20%; noninferiority: −20% to −10%; equivalence: ±12.5 to ±18%).4 Selection of different noninferiority margins might result in similar trial types with comparable treatment effect sizes being interpreted differently by healthcare providers who differ in their opinions of the maximum acceptable difference between interventions.
Imputation of Missing Clinical Data
Challenges in the interpretation of long-term clinical data. Psoriasis is a chronic disease, with approximately 90 percent of patients requiring long-term therapy.16 Therapeutic options with favorable long-term efficacy and safety profiles are likely to exert the largest beneficial effect on the patient’s disease course.16 Therefore, reliable long-term data are crucial in evaluating the clinical efficacy of a new therapy. However, long-term clinical trials face challenges related to data collection and analysis, with the most notable being the difficulty posed by missing clinical data. Various factors contribute to missing data, including premature patient discontinuation due to loss of efficacy or development of adverse events, patients missing or having incomplete assessments or assessments conducted outside the visit window, and patient loss to follow-up.16 Missing data complicate the interpretation of study findings and might bias the results, especially as patient numbers progressively decline over the course of long-term trials.16,23 Additionally, missing patient data reduce the statistical power needed to identify differences between treatment groups because fewer patients contribute data to the analyses.16
Guidelines for handling missing patient data. The ITT principle requires that all randomized patients be included in data analyses regardless of treatment completion or discontinuation or adherence to the study protocol.24 The FDA and the European Medicines Agency support the ITT principle, prespecification of the methods for handling missing patient data in the protocol or statistical analysis plan, and a conservative approach for imputation of missing data that excludes bias in favor of the new therapy.24–26 To ensure that data are analyzed and presented appropriately, several imputation methods should be conducted in trials with missing patient data for any reason to assess the sensitivity of clinical outcomes.23 Additionally, reasons for patient withdrawal and loss to follow-up should be documented in the study publication, which will allow readers to assess the potential effect of patient dropouts on between-group comparisons.
Strategies for handling missing patient data. Strategies used to handle missing patient data, such as replacement with substituted values (imputation), might also introduce bias that affects the interpretation of clinical trial results.24 Single-imputation strategies, in which a single plausible value replaces each missing observation, confer an impression of certainty that might not be justified, thereby affecting the interpretation of clinical trial outcomes.24 Multiple imputation (MI) strategies, in which multiple plausible values replace each missing observation, better estimate the true value and introduce less bias than single imputation, because multiple-imputed values reflect the uncertainty in the estimation.24 Introduction of bias during the data imputation process might result in an inaccurate and imprecise estimation of treatment effect, thereby limiting the ability to perform between-group comparisons.24
Several standard strategies are used alone or in combination to address the issue of missing patient data, including strict nonresponder imputation (NRI), modified NRI (mNRI), treatment failure rules (TFR), last observation carried forward (LOCF), modified baseline observation carried forward (mBOCF), MI, and observed values (Table 2).6,16,23,24,27 Strategies such as NRI, mNRI, and TFR have been used to analyze binary variables (e.g., responder endpoints) in clinical trials of oral small-molecule agents and biologic agents used to treat psoriasis (Table 3). Strategies such as LOCF, mBOCF, and observed values have been used for the analysis of continuous variables in clinical trials of psoriasis therapeutics.6,8
Strict NRI is a conservative single-imputation strategy that accounts for missing data by treating patients who have been lost to follow-up as nonresponders.24 Substituting worst-case nonresponse values for all missing values has the advantage of not skewing efficacy results in favor of the new therapy. However, this strategy might underestimate efficacy because it considers patients who achieved a favorable response but dropped out of a trial for reasons unrelated to treatment, such as relocation, as nonresponders.16,23,24 The US prescribing information and the European summary of product characteristics of the oral small-molecule agent deucravacitinib present response rates using the NRI strategy to impute missing data.28,29
In contrast, mNRI is an imputation strategy that attempts to reduce the potential negative bias associated with strict NRI by imputing a broader range of values than nonresponse. Patients who discontinue treatment due to reasons such as worsening of psoriasis are imputed as nonresponders;30 other missing data are analyzed using MI (described below).
TFR is another single-imputation strategy that imputes nonresponse only for treatment failures, defined as patients who discontinue treatment for reasons such as lack of efficacy, adverse events, worsening of disease, or use of a protocol-prohibited medication.27
LOCF, another single-imputation strategy, assigns the response recorded during the last visit to all subsequent missed visits.16 LOCF might overestimate response rates because it assumes that patients who have been lost to follow-up but had previously responded to treatment continue to respond after discontinuation, although these patients might have lost response.16,24 This is especially problematic in longer-term trials with high patient withdrawal rates due to loss of response over time; imputation of a large number of missing values as responders will probably introduce bias and overestimate the actual response rate.16,24 Conversely, LOCF might underestimate response in cases where nonresponders begin to respond shortly after treatment discontinuation. LOCF, especially when applied to longer-term clinical trials, is similar to real-world situations where patients are frequently lost to follow-up.
The single-imputation strategy mBOCF may also be used for missing data; in this approach, the baseline observation is carried forward for patients who discontinued treatment due to lack of efficacy or adverse events.6 The last valid observation is carried forward for all other patients with missing data.6 Therefore, mBOCF may be regarded as a combination of the BOCF and LOCF data imputation approaches.
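The difference between LOCF and mBOCF for a continuous variable can be sketched in Python as follows. The PASI scores, the discontinuation-reason labels, and the function names are hypothetical, and individual trials may define these rules differently.

```python
def locf(visits):
    """Last observation carried forward: fill each missing visit with the latest observed value."""
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)  # stays None if nothing has been observed yet
    return filled


def mbocf(visits, baseline, reason):
    """Modified BOCF: baseline value after discontinuation for lack of efficacy or
    adverse events; last observation carried forward for all other missing data."""
    if reason in {"lack_of_efficacy", "adverse_event"}:
        return [baseline if value is None else value for value in visits]
    return locf(visits)


# Hypothetical patient: baseline PASI 20, two observed visits, then two missed visits
scores = [12, 6, None, None]
print(locf(scores))                         # [12, 6, 6, 6]
print(mbocf(scores, 20, "adverse_event"))   # [12, 6, 20, 20]
print(mbocf(scores, 20, "relocation"))      # [12, 6, 6, 6]
```

For the same patient, LOCF retains the improved score of 6 at the missed visits, whereas mBOCF reverts to the baseline score of 20 when discontinuation was due to an adverse event.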
MI addresses the uncertainty related to missing data by generating multiple complete data sets, which are analyzed separately and then combined to obtain estimates for overall treatment differences and associated errors.24 Because the combined error reflects the variability among the imputed data sets, MI estimates the precision of clinical response rates more accurately than single imputation and is generally regarded as a state-of-the-art approach to data imputation.24
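The generate, analyze, and combine steps of MI can be sketched for a single response rate. This is a deliberately simplified illustration and not the imputation model used in any of the trials discussed: it assumes that data are missing completely at random, draws the response probability for each imputed data set from a Beta distribution based on the observed data, and pools the results with Rubin's rules. The function name, the parameter names, and the patient counts are hypothetical.

```python
import random
from statistics import mean, variance
from typing import List, Tuple


def multiple_imputation_rate(observed: List[int], n_missing: int,
                             m: int = 20, seed: int = 0) -> Tuple[float, float]:
    """Pooled response rate and standard error from m imputed data sets.

    observed  : 1 (responder) or 0 (nonresponder) for patients with data
    n_missing : number of patients whose response is missing
    Assumption: missing completely at random, no covariates.
    """
    rng = random.Random(seed)
    n_total = len(observed) + n_missing
    successes = sum(observed)
    failures = len(observed) - successes

    estimates, within_vars = [], []
    for _ in range(m):
        # Draw a plausible response probability, then impute missing values.
        p = rng.betavariate(1 + successes, 1 + failures)
        imputed = sum(rng.random() < p for _ in range(n_missing))
        rate = (successes + imputed) / n_total
        estimates.append(rate)
        within_vars.append(rate * (1 - rate) / n_total)

    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    pooled = mean(estimates)
    within = mean(within_vars)
    between = variance(estimates)
    total_var = within + (1 + 1 / m) * between
    return pooled, total_var ** 0.5


# Hypothetical trial arm: 80 patients with data (60 responders), 20 missing.
observed = [1] * 60 + [0] * 20
rate, se = multiple_imputation_rate(observed, n_missing=20)
print(f"pooled response rate: {rate:.3f}, standard error: {se:.3f}")
```

The between-imputation term in the pooled variance is what a single-imputation method omits; it widens the standard error to reflect that the imputed values are not known with certainty.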
An observed-values analysis uses only data that have been directly observed and does not impute any values for patients who have dropped out of the trial.16,23,24 This strategy accurately assesses response rates in patients who remain in the trial, but it does not capture important information about patients who discontinued treatment.16,23,24 Use of only observed values might overestimate clinical response rates because patients remaining in the trial are more likely to have achieved therapeutic benefit.16,23,24
Finally, it is also worth noting that data imputation strategies are not defined consistently across the psoriasis literature; the same strategy can have different definitions in different publications.24,27 Therefore, when interpreting clinical trial results, it is important not only to identify the specific data imputation strategy employed in a particular trial, but also to understand that there might be differences in how specific imputation strategies are defined in individual trials.
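A short sketch can show how the same data set yields different response rates under NRI, LOCF, and an observed-values analysis. The patient counts below are hypothetical and chosen only for illustration; each record pairs the response at the last attended visit with the response at the endpoint visit, with `None` marking a missing endpoint value.

```python
from typing import List, Optional, Tuple

# (response at last attended visit, response at endpoint or None if missing)
Record = Tuple[bool, Optional[bool]]


def nri_rate(records: List[Record]) -> float:
    """NRI: missing endpoint values count as nonresponse."""
    return sum(end is True for _, end in records) / len(records)


def locf_rate(records: List[Record]) -> float:
    """LOCF: a missing endpoint value is replaced by the last observed one."""
    filled = [last if end is None else end for last, end in records]
    return sum(filled) / len(records)


def observed_rate(records: List[Record]) -> float:
    """Observed values: patients with a missing endpoint are excluded."""
    seen = [end for _, end in records if end is not None]
    return sum(seen) / len(seen)


# Hypothetical arm of 100 patients: 80 attend the endpoint visit (60 respond);
# 20 drop out, of whom 8 were responders at their last attended visit.
records: List[Record] = (
    [(True, True)] * 60
    + [(False, False)] * 20
    + [(True, None)] * 8
    + [(False, None)] * 12
)

print(f"NRI:      {nri_rate(records):.2f}")
print(f"LOCF:     {locf_rate(records):.2f}")
print(f"Observed: {observed_rate(records):.2f}")
```

With these hypothetical counts, the response rate rises from NRI to LOCF to observed values, consistent with the ordering from most to least conservative described above.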
Illustration of imputation of missing data in clinical trials. Figure 2 illustrates the effect of the various methods for imputation of missing data on response rates in a hypothetical clinical trial. Response rates, which increase as the data imputation strategy becomes less conservative, range from 60 percent with NRI to 75 percent with observed values.16 Phase III trials of oral small-molecule agents and biologics, such as apremilast, deucravacitinib, secukinumab, risankizumab, tildrakizumab, and guselkumab, have used several data imputation methodologies, including NRI, mNRI, TFR, LOCF, MI, and observed values (Table 3).6–9,12–14,24,31–35 Efficacy response rates for these agents were generally consistent across evaluated imputation methodologies, and differences tended to be slightly greater at Week 52 compared to Week 12 or Week 16 (primary endpoint assessment).6–9,12–14,24,32–35 Analysis of long-term extension trials of these agents demonstrated that efficacy responses tended to decline with increasing stringency of the data imputation methodology; differences in response rates were generally more pronounced during the long-term extension trials compared to the Phase III trials, probably because of higher percentages of missing patient data as the trials progressed (data not shown).27,30,33
These findings suggest that the choice of a particular data imputation methodology affects efficacy response rates. Therefore, comparison of the efficacy of oral small-molecule agents and biologics in psoriasis should be undertaken carefully; depending on choices made about how missing data should be handled, a new psoriasis therapy might appear more or less efficacious than a comparator. When less conservative imputation methods, such as LOCF or observed values, are used, more conservative approaches, such as NRI, should also be presented to give a complete and balanced picture of efficacy responses. Understanding the differences in the imputation strategies used to handle missing data is essential for making meaningful comparisons of efficacy across different therapies and providing patients with evidence-based treatment recommendations.
Conclusion
Although cross-trial comparisons are necessary to make informed treatment decisions in the real world, clinical evidence of the highest quality is derived from well-designed, comparative, randomized, controlled, clinical trials. However, these types of clinical trials are not always available, and comparing results from separate trials might be misleading. Complex clinical trial design and data analysis strategies, in addition to other aspects such as the patient population and timing of the trial, pose data interpretation challenges for healthcare providers seeking to compare efficacy outcomes across different clinical trials. An understanding of how differences in design- and analysis-related parameters affect efficacy response rates and limit the validity of side-by-side comparisons is necessary for well-informed interpretation of clinical trial results.
References
- Greb JE, Goldminz AM, Elder JT, et al. Psoriasis. Nat Rev Dis Primers. 2016;2:16082.
- Balogh EA, Bashyam AM, Ghamrawi RI, Feldman SR. Emerging systemic drugs in the treatment of plaque psoriasis. Expert Opin Emerg Drugs. 2020;25(2):89–100.
- Silverberg JI. Study designs in dermatology: practical applications of study designs and their statistics in dermatology. J Am Acad Dermatol. 2015;73(5):733–740; quiz 741–742.
- Wan MT, Alvarez J, Shin DB, et al. Head-to-head trials of systemic psoriasis therapies: a systematic review of study design and maximum acceptable treatment differences. J Eur Acad Dermatol Venereol. 2019;33(1):42–55.
- Kishore K, Mahajan R. Understanding superiority, noninferiority, and equivalence for clinical trials. Indian Dermatol Online J. 2020;11(6):890–894.
- Armstrong AW, Gooderham M, Warren RB, et al. Deucravacitinib versus placebo and apremilast in moderate to severe plaque psoriasis: efficacy and safety results from the 52-week, randomized, double-blinded, placebo-controlled phase 3 POETYK PSO-1 trial. J Am Acad Dermatol. 2023;88(1):29–39.
- Strober B, Thaçi D, Sofen H, et al. Deucravacitinib versus placebo and apremilast in moderate to severe plaque psoriasis: efficacy and safety results from the 52-week, randomized, double-blinded, Program fOr Evaluation of TYK2 inhibitor psoriasis second phase 3 trial. J Am Acad Dermatol. 2023;88(1):40–51.
- Papp K, Reich K, Leonardi CL, et al. Apremilast, an oral phosphodiesterase 4 (PDE4) inhibitor, in patients with moderate to severe plaque psoriasis: results of a Phase III, randomized, controlled trial (Efficacy and Safety Trial Evaluating the Effects of Apremilast in Psoriasis [ESTEEM] 1). J Am Acad Dermatol. 2015;73(1):37–49.
- Paul C, Cather J, Gooderham M, et al. Efficacy and safety of apremilast, an oral phosphodiesterase 4 inhibitor, in patients with moderate-to-severe plaque psoriasis over 52 weeks: a Phase III, randomized controlled trial (ESTEEM 2). Br J Dermatol. 2015;173(6):1387–1399.
- Murad MH, Katabi A, Benkhadra R, Montori VM. External validity, generalisability, applicability and directness: a brief primer. BMJ Evidence-Based Med. 2018;23(1):17–19.
- Menter A, Gelfand JM, Connor C, et al. Joint American Academy of Dermatology-National Psoriasis Foundation guidelines of care for the management of psoriasis with systemic nonbiologic therapies. J Am Acad Dermatol. 2020;82(6):1445–1486.
- Gordon KB, Strober B, Lebwohl M, et al. Efficacy and safety of risankizumab in moderate-to-severe plaque psoriasis (UltIMMa-1 and UltIMMa-2): results from two double-blind, randomised, placebo-controlled and ustekinumab-controlled Phase 3 trials. Lancet. 2018;392(10148):650–661.
- Reich K, Papp KA, Blauvelt A, et al. Tildrakizumab versus placebo or etanercept for chronic plaque psoriasis (reSURFACE 1 and reSURFACE 2): results from two randomised controlled, Phase 3 trials. Lancet. 2017;390(10091):276–288.
- Blauvelt A, Papp KA, Griffiths CE, et al. Efficacy and safety of guselkumab, an anti-interleukin-23 monoclonal antibody, compared with adalimumab for the continuous treatment of patients with moderate to severe psoriasis: results from the Phase III, double-blinded, placebo- and active comparator-controlled VOYAGE 1 trial. J Am Acad Dermatol. 2017;76(3):405–417.
- United States Food and Drug Administration. Diversity plans to improve enrollment of participants from underrepresented racial and ethnic populations in clinical trials. Apr 2022. https://www.fda.gov/media/157635/download. Accessed 13 Mar 2023.
- Langley RG, Reich K. The interpretation of long-term trials of biologic treatments for psoriasis: trial designs and the choices of statistical analyses affect ability to compare outcomes across trials. Br J Dermatol. 2013;169(6):1198–1206.
- Kim KS, Chan AW, Belley-Côté EP, Drucker AM. Noninferiority randomized controlled trials. J Invest Dermatol. 2022;142(7):1773–1777.
- Papp K, Gordon K, Thaci D, et al. Phase 2 trial of selective tyrosine kinase 2 inhibition in psoriasis. N Engl J Med. 2018;379(14):1313–1321.
- Bhardwaj SS, Camacho F, Derrow A, et al. Statistical significance and clinical relevance: the importance of power in clinical trials in dermatology. Arch Dermatol. 2004;140(12):1520–1523.
- McGough JJ, Faraone SV. Estimating the size of treatment effects: moving beyond p values. Psychiatry (Edgmont, Pa). 2009;6(10):21–29.
- Moher D, Hopewell S, Schulz KF, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ. 2010;340:c869.
- Kim N, Fischer AH, Dyring-Andersen B, et al. Research techniques made simple: choosing appropriate statistical methods for clinical research. J Invest Dermatol. 2017;137(10):e173–e178.
- Papp KA, Fonjallaz P, Casset-Semanaz F, et al. Analytical approaches to reporting long-term clinical trial data. Curr Med Res Opin. 2008;24(7):2001–2008.
- Langley RGB, Reich K, Papavassilis C, et al. Methods for imputing missing efficacy data in clinical trials of biologic psoriasis therapies: implications for interpretations of trial results. J Drugs Dermatol. 2017;16(8):734–741.
- European Medicines Agency. Guideline on Missing Data in Confirmatory Clinical Trials. 2 Jul 2010. https://www.ema.europa.eu/en/documents/scientific-guideline/guideline-missing-data-confirmatory-clinical-trials_en.pdf. Accessed 8 Dec 2022.
- United States Food and Drug Administration. Data Retention When Subjects Withdraw from FDA-Regulated Clinical Trials. Oct 2008. Current as of 10 Apr 2019. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/data-retention-when-subjects-withdraw-fda-regulated-clinical-trials. Accessed 8 Nov 2022.
- Griffiths CEM, Papp KA, Song M, et al. Continuous treatment with guselkumab maintains clinical responses through 4 years in patients with moderate-to-severe psoriasis: results from VOYAGE 1. J Dermatolog Treat. 2022;33(2):848–856.
- Sotyktu [package insert]. Princeton, NJ, USA: Bristol Myers Squibb; September 2022.
- Sotyktu [summary of product characteristics]. Dublin, Ireland: Bristol Myers Squibb EEIG; December 2023.
- Papp KA, Lebwohl MG, Puig L, et al. Long-term efficacy and safety of risankizumab for the treatment of moderate-to-severe plaque psoriasis: interim analysis of the LIMMitless open-label extension trial beyond 3 years of follow-up. Br J Dermatol. 2021;185(6):1135–1145.
- Warren RB, Armstrong AW, Gooderham M, et al. Deucravacitinib, an oral, selective tyrosine kinase 2 inhibitor, in moderate to severe plaque psoriasis: 52-week efficacy results from the phase 3 POETYK PSO-1 and POETYK PSO-2 trials [oral presentation]. Presented at: 30th European Academy of Dermatology and Venereology Congress; September 29–October 2, 2021.
- Langley RG, Elewski BE, Lebwohl M, et al. Secukinumab in plaque psoriasis–results of two Phase 3 trials. N Engl J Med. 2014;371(4):326–338.
- Reich K, Warren RB, Iversen L, et al. Long-term efficacy and safety of tildrakizumab for moderate-to-severe psoriasis: pooled analyses of two randomized phase III clinical trials (reSURFACE 1 and reSURFACE 2) through 148 weeks. Br J Dermatol. 2020;182(3):605–617.
- Reich K, Gordon KB, Strober BE, et al. Five-year maintenance of clinical response and health-related quality of life improvements in patients with moderate-to-severe psoriasis treated with guselkumab: results from VOYAGE 1 and VOYAGE 2. Br J Dermatol. 2021;185(6):1146–1159.
- Reich K, Armstrong AW, Foley P, et al. Efficacy and safety of guselkumab, an anti-interleukin-23 monoclonal antibody, compared with adalimumab for the treatment of patients with moderate to severe psoriasis with randomized withdrawal and retreatment: results from the Phase III, double-blind, placebo- and active comparator-controlled VOYAGE 2 trial. J Am Acad Dermatol. 2017;76(3):418–431.