-
Multiple imputation for longitudinal data: A tutorial
Authors:
Rushani Wijesuriya,
Margarita Moreno-Betancur,
John B Carlin,
Ian R White,
Matteo Quartagno,
Katherine J Lee
Abstract:
Longitudinal studies are frequently used in medical research and involve collecting repeated measures on individuals over time. Observations from the same individual are invariably correlated and thus an analytic approach that accounts for this clustering by individual is required. While almost all research suffers from missing data, this can be particularly problematic in longitudinal studies as participation often becomes harder to maintain over time. Multiple imputation (MI) is widely used to handle missing data in such studies. When using MI, it is important that the imputation model is compatible with the proposed analysis model. In a longitudinal analysis, this implies that the clustering considered in the analysis model should be reflected in the imputation process. Several MI approaches have been proposed to impute incomplete longitudinal data, such as treating repeated measurements of the same variable as distinct variables or using generalized linear mixed imputation models. However, the uptake of these methods has been limited, as they require additional data manipulation and use of advanced imputation procedures. In this tutorial, we review the available MI approaches that can be used for handling incomplete longitudinal data, including where individuals are clustered within higher-level clusters. We illustrate implementation with replicable R and Stata code using a case study from the Childhood to Adolescence Transition Study.
Submitted 10 April, 2024;
originally announced April 2024.
-
Applying the estimands framework to non-inferiority trials: guidance on choice of hypothetical estimands for non-adherence and comparison of estimation methods
Authors:
Katy E Morgan,
Ian R White,
Clémence Leyrat,
Simon Stanworth,
Brennan C Kahan
Abstract:
A common concern in non-inferiority (NI) trials is that non adherence due, for example, to poor study conduct can make treatment arms artificially similar. Because intention to treat analyses can be anti-conservative in this situation, per protocol analyses are sometimes recommended. However, such advice does not consider the estimands framework, nor the risk of bias from per protocol analyses. We therefore sought to update the above guidance using the estimands framework, and compare estimators to improve on the performance of per protocol analyses. We argue the main threat to validity of NI trials is the occurrence of trial specific intercurrent events (IEs), that is, IEs which occur in a trial setting, but would not occur in practice. To guard against erroneous conclusions of non inferiority, we suggest an estimand using a hypothetical strategy for trial specific IEs should be employed, with handling of other non trial specific IEs chosen based on clinical considerations. We provide an overview of estimators that could be used to estimate a hypothetical estimand, including inverse probability weighting (IPW), and two instrumental variable approaches (one using an informative Bayesian prior on the effect of standard treatment, and one using a treatment by covariate interaction as an instrument). We compare them, using simulation in the setting of all or nothing compliance in two active treatment arms, and conclude both IPW and the instrumental variable method using a Bayesian prior are potentially useful approaches, with the choice between them depending on which assumptions are most plausible for a given trial.
Submitted 1 December, 2023;
originally announced December 2023.
-
Estimation of treatment policy estimands for continuous outcomes using off treatment sequential multiple imputation
Authors:
Thomas Drury,
Juan J Abellan,
Nicky Best,
Ian R. White
Abstract:
The estimands framework outlined in ICH E9 (R1) describes the components needed to precisely define the effects to be estimated in clinical trials, which includes how post-baseline "intercurrent" events (IEs) are to be handled. In late-stage clinical trials, it is common to handle intercurrent events like "treatment discontinuation" using the treatment policy strategy and target the treatment effect on all outcomes regardless of treatment discontinuation. For continuous repeated measures, this type of effect is often estimated using all observed data before and after discontinuation using either a mixed model for repeated measures (MMRM) or multiple imputation (MI) to handle any missing data. In basic form, both of these estimation methods ignore treatment discontinuation in the analysis and therefore may be biased if there are differences in patient outcomes after treatment discontinuation compared to patients still assigned to treatment, and if missing data are more common for patients who have discontinued treatment. We therefore propose and evaluate a set of MI models that can accommodate differences between outcomes before and after treatment discontinuation. The models are evaluated in the context of planning a phase 3 trial for a respiratory disease. We show that analyses ignoring treatment discontinuation can introduce substantial bias and can sometimes underestimate variability. We also show that some of the MI models proposed can successfully correct the bias but inevitably lead to increases in variance. We conclude that some of the proposed MI models are preferable to the traditional analysis ignoring treatment discontinuation, but the precise choice of MI model will likely depend on the trial design, disease of interest and amount of observed and missing data following treatment discontinuation.
Submitted 25 August, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Use of multiple covariates in assessing treatment-effect modifiers: A methodological review of individual participant data meta-analyses
Authors:
Peter J Godolphin,
Nadine Marlin,
Chantelle Cornett,
David J Fisher,
Jayne F Tierney,
Ian R White,
Ewelina Rogozińska
Abstract:
Individual participant data (IPD) meta-analyses of randomised trials are considered a reliable way to assess participant-level treatment effect modifiers but may not make the best use of the available data. Traditionally, effect modifiers are explored one covariate at a time, which gives rise to the possibility that evidence of treatment-covariate interaction may be due to confounding from a different, related covariate. We aimed to evaluate current practice when estimating treatment-covariate interactions in IPD meta-analysis, specifically focusing on involvement of additional covariates in the models. We reviewed 100 IPD meta-analyses of randomised trials, published between 2015 and 2020, that assessed at least one treatment-covariate interaction. We identified four approaches to handling additional covariates: (1) Single interaction model (unadjusted): No additional covariates included (57/100 studies); (2) Single interaction model (adjusted): Adjustment for the main effect of at least one additional covariate (35/100); (3) Multiple interactions model: Adjustment for at least one two-way interaction between treatment and an additional covariate (3/100); and (4) Three-way interaction model: Three-way interaction formed between treatment, the additional covariate and the potential effect modifier (5/100). IPD is not being utilised to its fullest extent. In an exemplar dataset, we demonstrate how these approaches can lead to different conclusions. Researchers should adjust for additional covariates when estimating interactions in IPD meta-analysis providing they adjust their main effects, which is already widely recommended. Further, they should consider whether more complex approaches could provide better information on who might benefit most from treatments, improving patient choice and treatment policy and practice.
Submitted 24 February, 2023;
originally announced February 2023.
-
Phases of methodological research in biostatistics - building the evidence base for new methods
Authors:
Georg Heinze,
Anne-Laure Boulesteix,
Michael Kammer,
Tim P. Morris,
Ian R. White
Abstract:
Although the biostatistical scientific literature publishes new methods at a very high rate, many of these developments are not trustworthy enough to be adopted by the scientific community. We propose a framework to think about how a piece of methodological work contributes to the evidence base for a method. Similarly to the well-known phases of clinical research in drug development, we define four phases of methodological research. These four phases cover (I) providing logical reasoning and proofs, (II) providing empirical evidence, first in a narrow target setting, then (III) in an extended range of settings and for various outcomes, accompanied by appropriate application examples, and (IV) investigations that establish a method as sufficiently well-understood to know when it is preferred over others and when it is not. We provide basic definitions of the four phases but acknowledge that more work is needed to facilitate unambiguous classification of studies into phases. Methodological developments that have undergone all four proposed phases are still rare, but we give two examples with references. Our concept rebalances the emphasis to studies in phase III and IV, i.e., carefully planned methods comparison studies and studies that explore the empirical properties of existing methods in a wider range of problems.
Submitted 27 September, 2022;
originally announced September 2022.
-
Using modified intention-to-treat as a principal stratum estimator for failure to initiate treatment
Authors:
Brennan C Kahan,
Ian R White,
Mark Edwards,
Michael O Harhay
Abstract:
Background: A common intercurrent event affecting many trials is when some participants do not begin their assigned treatment. Many trials use a modified intention-to-treat (mITT) approach, whereby participants who do not initiate treatment are excluded from the analysis. However, it is not clear which estimand such an approach targets or which assumptions are necessary for it to be unbiased.
Methods: We demonstrate that a mITT analysis which excludes participants who do not begin treatment is estimating a principal stratum estimand (i.e. the treatment effect in the subpopulation of participants who would begin treatment, regardless of which arm they were assigned to). The mITT estimator is unbiased for the principal stratum estimand under the assumption that the intercurrent event is not affected by the assigned treatment arm, that is, participants who initiate treatment in one arm would also do so in the other arm.
Results: We identify two key criteria in determining whether the mITT estimator is likely to be unbiased: first, we must be able to measure the participants in each treatment arm who experience the intercurrent event, and second, the assumption that treatment allocation will not affect whether the participant begins treatment must be reasonable. Most double-blind trials will satisfy these criteria, and we provide an example of an open-label trial where these criteria are likely to be satisfied as well.
Conclusions: A modified intention-to-treat analysis which excludes participants who do not begin treatment can be an unbiased estimator for the principal stratum estimand. Our framework can help identify when the assumptions for unbiasedness are likely to hold, and thus whether modified intention-to-treat is appropriate or not.
Submitted 30 January, 2023; v1 submitted 8 June, 2022;
originally announced June 2022.
-
A comparison of strategies for selecting auxiliary variables for multiple imputation
Authors:
Rheanna M. Mainzer,
Cattram D. Nguyen,
John B. Carlin,
Margarita Moreno-Betancur,
Ian R. White,
Katherine J. Lee
Abstract:
Multiple imputation (MI) is a popular method for handling missing data. Auxiliary variables can be added to the imputation model(s) to improve MI estimates. However, the choice of which auxiliary variables to include in the imputation model is not always straightforward. Including too few may lead to important information being discarded, but including too many can cause problems with convergence of the estimation procedures for imputation models. Several data-driven auxiliary variable selection strategies have been proposed. This paper uses a simulation study and a case study to provide a comprehensive comparison of the performance of eight auxiliary variable selection strategies, with the aim of providing practical advice to users of MI. A complete case analysis and an MI analysis with all auxiliary variables included in the imputation model (the full model) were also performed for comparison. Our simulation study results suggest that the full model outperforms all auxiliary variable selection strategies, providing further support for adopting an inclusive auxiliary variable strategy where possible. Auxiliary variable selection using the Least Absolute Shrinkage and Selection Operator (LASSO) was the best performing auxiliary variable selection strategy overall and is a promising alternative when the full model fails. All MI analysis strategies that we were able to apply to the case study led to similar estimates.
Submitted 30 March, 2022;
originally announced March 2022.
-
Improving clinical trial interpretation with ACCEPT analyses
Authors:
Michelle N. Clements,
Ian R. White,
Andrew J. Copas,
Victoria Cornelius,
Suzie Cro,
David T Dunn,
Matteo Quartagno,
Rebecca M. Turner,
Conor D. Tweed,
A. Sarah Walker
Abstract:
Effective decision making from randomised controlled clinical trials relies on robust interpretation of the numerical results. However, the language we use to describe clinical trials can cause confusion both in trial design and in comparing results across trials. ACceptability Curve Estimation using Probability Above Threshold (ACCEPT) aids comparison between trials (even where of different designs) by harmonising reporting of results, acknowledging different interpretations of the results may be valid in different situations, and moving the focus from comparison to a pre-specified value to interpretation of the trial data. ACCEPT can be applied to historical trials or incorporated into statistical analysis plans for future analyses. An online tool enables ACCEPT on up to three trials simultaneously.
Submitted 15 June, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Handling missing data when estimating causal effects with Targeted Maximum Likelihood Estimation
Authors:
S. Ghazaleh Dashti,
Katherine J. Lee,
Julie A. Simpson,
Ian R. White,
John B. Carlin,
Margarita Moreno-Betancur
Abstract:
Targeted Maximum Likelihood Estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate eight missing data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing covariate missing indicator method, and five multiple imputation (MI) approaches using parametric or machine-learning models. Six scenarios were considered, varying in exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether outcome influenced missingness in other variables and presence of interaction/non-linear terms in missingness models). Complete-case analysis and extended TMLE had small biases when outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when missingness models included a non-linear term. When choosing a method to handle missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and non-linearities is expected to perform well.
Submitted 3 May, 2024; v1 submitted 9 December, 2021;
originally announced December 2021.
-
Covariate adjustment in randomised trials: canonical link functions protect against model mis-specification
Authors:
Ian R. White,
Tim P Morris,
Elizabeth Williamson
Abstract:
Covariate adjustment has the potential to increase power in the analysis of randomised trials, but mis-specification of the adjustment model could cause error. We explore what error is possible when the adjustment model omits a covariate by randomised treatment interaction, in a setting where the covariate is perfectly balanced between randomised treatments. We use mathematical arguments and analyses of single hypothetical data sets.
We show that analysis by a generalised linear model with the canonical link function leads to no error under the null -- that is, if treatment effect is truly zero under the adjusted model then it is also zero under the unadjusted model. However, using non-canonical link functions does not give this property and leads to potentially important error under the null. The error is present even in large samples and hence constitutes bias.
We conclude that covariate adjustment analyses of randomised trials should avoid non-canonical links. If a marginal risk difference is the target of estimation then this should not be estimated using an identity link; alternative preferable methods include standardisation and inverse probability of treatment weighting.
Submitted 15 July, 2021;
originally announced July 2021.
-
Planning a method for covariate adjustment in individually-randomised trials: a practical guide
Authors:
Tim P. Morris,
A. Sarah Walker,
Elizabeth J. Williamson,
Ian R. White
Abstract:
Background: It has long been advised to account for baseline covariates in the analysis of confirmatory randomised trials, with the main statistical justifications being that this increases power and, when a randomisation scheme balanced covariates, permits a valid estimate of experimental error. There are various methods available to account for covariates but it is not clear how to choose among them. Methods: Taking the perspective of writing a statistical analysis plan, we consider how to choose between the three most promising broad approaches: direct adjustment, standardisation and inverse-probability-of-treatment weighting (IPTW). Results: The three approaches are similar in being asymptotically efficient, in losing efficiency with mis-specified covariate functions, and in handling designed balance. If a marginal estimand is targeted (for example, a risk difference or survival difference), then direct adjustment should be avoided because it involves fitting non-standard models that are subject to convergence issues. Convergence is most likely with IPTW. Robust standard errors used by IPTW are anti-conservative at small sample sizes. All approaches can use similar methods to handle missing covariate data. With missing outcome data, each method has its own way to estimate a treatment effect in the all-randomised population. We illustrate some issues in a reanalysis of GetTested, a randomised trial designed to assess the effectiveness of an electronic sexually-transmitted-infection testing and results service. Conclusions: No single approach is always best: the choice will depend on the trial context. We encourage trialists to consider all three methods more routinely.
Submitted 8 December, 2021; v1 submitted 13 July, 2021;
originally announced July 2021.
-
Introducing the treatment hierarchy question in network meta-analysis
Authors:
Georgia Salanti,
Adriani Nikolakopoulou,
Orestis Efthimou,
Dimitris Mavridis,
Matthias Egger,
Ian R. White
Abstract:
Background: Comparative effectiveness research using network meta-analysis can present a hierarchy of competing treatments, from the least to most preferable option. However, the research question associated with the hierarchy of multiple interventions is never clearly defined in published reviews. Methods and Results: We introduce the notion of a treatment hierarchy question that describes the criterion for choosing a specific treatment over one or more competing alternatives. For example, stakeholders might ask which treatment is most likely to improve mean survival by at least 2 years or which treatment is associated with the longest mean survival. The answers to these two questions are not necessarily the same. We discuss the most commonly used ranking metrics (quantities that describe or compare the estimated treatment-specific effects), how the metrics produce a treatment hierarchy and the type of treatment hierarchy question that each metric can answer. We show that the ranking metrics encompass the uncertainty in the estimation of the treatment effects in different ways, which results in different treatment hierarchies. Conclusions: Network meta-analyses that aim to rank treatments should state in the protocol the treatment hierarchy question they aim to address and employ the appropriate ranking metric to answer it.
Submitted 20 October, 2020;
originally announced October 2020.
-
The design and statistical aspects of VIETNARMS: a strategic post-licensing trial of multiple oral direct acting antiviral Hepatitis C treatment strategies in Vietnam
Authors:
L. McCabe,
I. R. White,
N. V. Vinh Chau,
E. Barnes,
S. L. Pett,
G. S. Cooke,
A. S. Walker
Abstract:
Background: Achieving hepatitis C elimination is hampered by the costs of treatment and the need to treat hard-to-reach populations. Treatment access could be widened by shortening treatment, but limited research means it is unclear which strategies could achieve sufficiently high cure rates to be acceptable. We present the statistical aspects of a multi-arm trial designed to test multiple strategies simultaneously with a monitoring mechanism to detect and stop those with unacceptably low cure rates quickly. Methods: The VIETNARMS trial will factorially randomise patients to three randomisations. We will use Bayesian monitoring at interim analyses to detect and stop recruitment into unsuccessful strategies, defined as a >0.95 posterior probability of the true cure rate being <90%. Here, we tested the operating characteristics of the stopping guideline, planned the timing of the interim analyses and explored power at the final analysis. Results: A beta(4.5, 0.5) prior for the true cure rate produces <0.05 probability of incorrectly stopping a group with true cure rate >90%. Groups with very low cure rates (<60%) are very likely (>0.9 probability) to stop after ~25% of patients are recruited. Groups with moderately low cure rates (80%) are likely to stop (0.7 probability) before the end of recruitment. Interim analyses 7, 10, 13 and 18 months after recruitment commences provide good probabilities of stopping inferior groups. For an overall true cure rate of 95%, power is >90% to detect non-inferiority in the regimen and strategy comparisons using 5% and 10% margins respectively, regardless of the control cure rate, and to detect a 5% absolute difference in the ribavirin comparison. Conclusions: The operating characteristics of the stopping guideline are appropriate and interim analyses can be timed to detect failing groups at various stages.
Submitted 6 November, 2019;
originally announced November 2019.
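The stopping guideline described in the VIETNARMS abstract (stop a group when the posterior probability that its true cure rate is below 90% exceeds 0.95, under a beta(4.5, 0.5) prior) can be illustrated numerically. The following is a minimal sketch, not the trial's own code: it assumes a conjugate beta-binomial update, and the function names and interim counts are hypothetical.

```python
import math


def prob_cure_rate_below(threshold, cures, failures,
                         prior_a=4.5, prior_b=0.5, steps=2000):
    """Posterior P(cure rate < threshold) under a beta prior.

    Assumption: conjugate update, so the posterior is
    beta(prior_a + cures, prior_b + failures). The incomplete beta
    integral is approximated with Simpson's rule on [0, threshold];
    threshold must be strictly below 1 and steps must be even.
    """
    a = prior_a + cures
    b = prior_b + failures
    # Log of the beta function B(a, b), used to normalise the density.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(x):
        if x <= 0.0:
            return 0.0  # valid because a > 1 in every case used here
        return math.exp((a - 1) * math.log(x)
                        + (b - 1) * math.log(1 - x) - log_beta)

    h = threshold / steps
    total = density(0.0) + density(threshold)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


def should_stop(cures, failures, cure_target=0.90, cutoff=0.95):
    """Apply the guideline: stop if P(cure rate < target) > cutoff."""
    return prob_cure_rate_below(cure_target, cures, failures) > cutoff


if __name__ == "__main__":
    # Hypothetical interim data for two groups.
    for cures, failures in [(3, 7), (20, 0)]:
        p = prob_cure_rate_below(0.90, cures, failures)
        print(cures, failures, round(p, 3), should_stop(cures, failures))
```

Restricting the integral to [0, 0.9] avoids the singularity of the beta(4.5, 0.5) density at 1, which is why simple quadrature is adequate for this illustration.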
-
Handling an uncertain control group event risk in non-inferiority trials: non-inferiority frontiers and the power-stabilising transformation
Authors:
Matteo Quartagno,
A. Sarah Walker,
Abdel G. Babiker,
Rebecca M. Turner,
Mahesh K. B. Parmar,
Andrew Copas,
Ian R. White
Abstract:
Background. Non-inferiority (NI) trials are increasingly used to evaluate new treatments expected to have secondary advantages over standard of care, but similar efficacy on the primary outcome. When designing a NI trial with a binary primary outcome, the choice of effect measure for the NI margin has an important effect on sample size calculations; furthermore, if the control event risk observed is markedly different from that assumed, the trial can quickly lose power or the results become difficult to interpret. Methods. We propose a new way of designing NI trials to overcome the issues raised by unexpected control event risks by specifying a NI frontier, i.e. a curve defining the most appropriate non-inferiority margin for each possible value of control event risk. We propose a fixed arcsine difference frontier, the power-stabilising transformation for binary outcomes. We propose and compare three ways of designing a trial using this frontier. Results. Testing and reporting on the arcsine scale leads to results which are challenging to interpret clinically. Working on the arcsine scale generally requires a larger sample size compared to the risk difference scale. Therefore, working on the risk difference scale, modifying the margin after observing the control event risk, might be preferable, as it requires a smaller sample size. However, this approach tends to slightly inflate type I error rate; a solution is to use a lower significance level for testing. When working on the risk ratio scale, the same approach leads to power levels above the nominal one, maintaining type I error under control. Conclusions. Our proposed methods of designing NI trials using power-stabilising frontiers make trial design more resilient to unexpected values of the control event risk, at the only cost of requiring larger sample sizes when the goal is to report results on the risk difference scale.
Submitted 1 May, 2019;
originally announced May 2019.
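The idea of a fixed arcsine difference frontier in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes the frontier holds asin(sqrt(experimental risk)) minus asin(sqrt(control risk)) constant, calibrated at the expected control event risk and the planned risk-difference margin, and that the event is unfavourable (higher risk is worse). All function names and numbers are hypothetical.

```python
import math


def arcsine_difference(control_risk, experimental_risk):
    """Difference between two risks on the arcsine-square-root scale."""
    return (math.asin(math.sqrt(experimental_risk))
            - math.asin(math.sqrt(control_risk)))


def frontier_margin(observed_control_risk, expected_control_risk,
                    planned_margin):
    """Risk-difference margin implied by a fixed arcsine difference.

    The arcsine difference is fixed at the value implied by the design
    assumptions (expected control risk and planned risk-difference
    margin), then mapped back to the risk scale at the observed
    control risk.
    """
    delta = arcsine_difference(expected_control_risk,
                               expected_control_risk + planned_margin)
    angle = math.asin(math.sqrt(observed_control_risk)) + delta
    angle = min(angle, math.pi / 2)  # keep the tolerable risk at or below 1
    max_tolerable_risk = math.sin(angle) ** 2
    return max_tolerable_risk - observed_control_risk


if __name__ == "__main__":
    # Hypothetical design: expected control risk 5%, planned margin 5 points.
    for observed in (0.02, 0.05, 0.10, 0.50):
        print(observed, round(frontier_margin(observed, 0.05, 0.05), 4))
```

Under these assumptions the implied risk-difference margin equals the planned margin at the expected control risk and widens as the control risk moves towards 50%, where the binomial variance is largest.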
-
Bivariate network meta-analysis for surrogate endpoint evaluation
Authors:
Sylwia Bujkiewicz,
Dan Jackson,
John R Thompson,
Rebecca Turner,
Keith R Abrams,
Ian R White
Abstract:
Surrogate endpoints are very important in regulatory decision-making in healthcare, in particular if they can be measured early compared to the long-term final clinical outcome and act as good predictors of clinical benefit. Bivariate meta-analysis methods can be used to evaluate surrogate endpoints and to predict the treatment effect on the final outcome from the treatment effect measured on a surrogate endpoint. However, candidate surrogate endpoints are often imperfect, and the level of association between the treatment effects on the surrogate and final outcomes may vary between treatments. This imposes a limitation on the pairwise methods which do not differentiate between the treatments. We develop bivariate network meta-analysis (bvNMA) methods which combine data on treatment effects on the surrogate and final outcomes, from trials investigating heterogeneous treatment contrasts. The bvNMA methods estimate the effects on both outcomes for all treatment contrasts individually in a single analysis. At the same time, they allow us to model the surrogacy patterns across multiple trials (different populations) within a treatment contrast and across treatment contrasts, thus enabling predictions of the treatment effect on the final outcome for a new study in a new population or investigating a new treatment. Modelling assumptions about the between-studies heterogeneity and the network consistency, and their impact on predictions, are investigated using simulated data and an illustrative example in advanced colorectal cancer. When the strength of the surrogate relationships varies across treatment contrasts, bvNMA has the advantage of identifying treatments for which surrogacy holds, thus leading to better predictions.
Submitted 24 July, 2018;
originally announced July 2018.
-
Using simulation studies to evaluate statistical methods
Authors:
Tim P Morris,
Ian R White,
Michael J Crowther
Abstract:
Simulation studies are computer experiments that involve creating data by pseudorandom sampling. The key strength of simulation studies is the ability to understand the behaviour of statistical methods because some 'truth' (usually some parameter/s of interest) is known from the process of generating the data. This allows us to consider properties of methods, such as bias. While widely used, simulation studies are often poorly designed, analysed and reported. This tutorial outlines the rationale for using simulation studies and offers guidance for design, execution, analysis, reporting and presentation. In particular, this tutorial provides: a structured approach for planning and reporting simulation studies, which involves defining aims, data-generating mechanisms, estimands, methods and performance measures ('ADEMP'); coherent terminology for simulation studies; guidance on coding simulation studies; a critical discussion of key performance measures and their estimation; guidance on structuring tabular and graphical presentation of results; and new graphical presentations. With a view to describing recent practice, we review 100 articles taken from Volume 34 of Statistics in Medicine that included at least one simulation study and identify areas for improvement.
Submitted 5 December, 2018; v1 submitted 8 December, 2017;
originally announced December 2017.
-
A causal modelling framework for reference-based imputation and tipping point analysis
Authors:
Ian R. White,
Royes Joseph,
Nicky Best
Abstract:
We consider estimating the "de facto" or effectiveness estimand in a randomised placebo-controlled or standard-of-care-controlled drug trial with quantitative outcome, where participants who discontinue an investigational treatment are not followed up thereafter. Carpenter et al (2013) proposed reference-based imputation methods which use a reference arm to inform the distribution of post-discontinuation outcomes and hence to inform an imputation model. However, the reference-based imputation methods were not formally justified. We present a causal model which makes an explicit assumption in a potential outcomes framework about the maintained causal effect of treatment after discontinuation. We show that the "jump to reference", "copy reference" and "copy increments in reference" reference-based imputation methods, with the control arm as the reference arm, are special cases of the causal model with specific assumptions about the causal treatment effect. Results from simulation studies are presented. We also show that the causal model provides a flexible and transparent framework for a tipping point sensitivity analysis in which we vary the assumptions made about the causal effect of discontinued treatment. We illustrate the approach with data from two longitudinal clinical trials.
Submitted 12 May, 2017;
originally announced May 2017.
-
A mean score method for sensitivity analysis to departures from the missing at random assumption in randomised trials
Authors:
Ian R. White,
James Carpenter,
Nicholas J. Horton
Abstract:
Most analyses of randomised trials with incomplete outcomes make untestable assumptions and should therefore be subjected to sensitivity analyses. However, methods for sensitivity analyses are not widely used. We propose a mean score approach for exploring global sensitivity to departures from missing at random or other assumptions about incomplete outcome data in a randomised trial. We assume a single outcome analysed under a generalised linear model. One or more sensitivity parameters, specified by the user, measure the degree of departure from missing at random in a pattern mixture model. Advantages of our method are that its sensitivity parameters are relatively easy to interpret and so can be elicited from subject matter experts; it is fast and non-stochastic; and its point estimate, standard error and confidence interval agree perfectly with standard methods when particular values of the sensitivity parameters make those standard methods appropriate. We illustrate the method using data from a mental health trial.
Submitted 2 May, 2017;
originally announced May 2017.
-
Multiple imputation for multilevel data with continuous and binary variables
Authors:
Vincent Audigier,
Ian R. White,
Shahab Jolani,
Thomas P. A. Debray,
Matteo Quartagno,
James Carpenter,
Stef van Buuren,
Matthieu Resche-Rigon
Abstract:
We present and compare multiple imputation methods for multilevel continuous and binary data where variables are systematically and sporadically missing.
The methods are compared from a theoretical point of view and through an extensive simulation study motivated by a real dataset comprising multiple studies. Simulations are reproducible. The comparisons show why these multiple imputation methods are the most appropriate to handle missing values in a multilevel setting and why their relative performances can vary according to the missing data pattern, the multilevel structure and the type of missing variables.
This study shows that valid inferences can only be obtained if the dataset gathers a large number of clusters. In addition, it highlights that heteroscedastic MI methods provide more accurate inferences than homoscedastic methods, which should be reserved for data with few individuals per cluster. Finally, the method of Quartagno and Carpenter (2016a) appears generally accurate for binary variables, the method of Resche-Rigon and White (2016) with large clusters, and the approach of Jolani et al. (2015) with small clusters.
Submitted 27 November, 2017; v1 submitted 3 February, 2017;
originally announced February 2017.
-
Propensity score analysis with partially observed confounders: how should multiple imputation be used?
Authors:
Clemence Leyrat,
Shaun R. Seaman,
Ian R. White,
Ian Douglas,
Liam Smeeth,
Joseph Kim,
Matthieu Resche-Rigon,
James R. Carpenter,
Elizabeth J. Williamson
Abstract:
Inverse probability of treatment weighting (IPTW) is a popular propensity score (PS)-based approach to estimate causal effects in observational studies at risk of confounding bias. A major issue when estimating the PS is the presence of partially observed covariates. Multiple imputation (MI) is a natural approach to handle missing data on covariates, but its use in the PS context raises three important questions: (i) should we apply Rubin's rules to the IPTW treatment effect estimates or to the PS estimates themselves? (ii) does the outcome have to be included in the imputation model? (iii) how should we estimate the variance of the IPTW estimator after MI? We performed a simulation study focusing on the effect of a binary treatment on a binary outcome with three confounders (two of them partially observed). We used MI with chained equations to create complete datasets and compared three ways of combining the results: combining treatment effect estimates (MIte); combining the PS across the imputed datasets (MIps); or combining the PS parameters and estimating the PS of the average covariates across the imputed datasets (MIpar). We also compared the performance of these methods to complete case (CC) analysis and the missingness pattern (MP) approach, a method which uses a different PS model for each pattern of missingness. We also studied empirically the consistency of these 3 MI estimators. Under a missing at random (MAR) mechanism, CC and MP analyses were biased in most cases when estimating the marginal treatment effect, whereas MI approaches had good performance in reducing bias as long as the outcome was included in the imputation model. However, only MIte was unbiased in all the studied scenarios and Rubin's rules provided good variance estimates for MIte.
Submitted 19 August, 2016;
originally announced August 2016.
-
Multiple imputation of covariates by fully conditional specification: accommodating the substantive model
Authors:
Jonathan W. Bartlett,
Shaun R. Seaman,
Ian R. White,
James R. Carpenter
Abstract:
Missing covariate data commonly occur in epidemiological and clinical research, and are often dealt with using multiple imputation (MI). Imputation of partially observed covariates is complicated if the substantive model is non-linear (e.g. Cox proportional hazards model), or contains non-linear (e.g. squared) or interaction terms, and standard software implementations of MI may impute covariates from models that are incompatible with such substantive models. We show how imputation by fully conditional specification, a popular approach for performing MI, can be modified so that covariates are imputed from models which are compatible with the substantive model. We investigate through simulation the performance of this proposal, and compare it to existing approaches. Simulation results suggest our proposal gives consistent estimates for a range of common substantive models, including models which contain non-linear covariate effects or interactions, provided data are missing at random and the assumed imputation models are correctly specified and mutually compatible.
Submitted 17 January, 2013; v1 submitted 25 October, 2012;
originally announced October 2012.