Imputation of Missing Data through Bayesian Approach
Pratibha Jalui
Cytel Statistical Software & Services Pvt. Ltd, Pune
Email: pratibha.jalui@cytel.com
Reetabrata Bhattacharyya
Tata Consultancy Services Limited, Mumbai
Email: reetabrata.b@tcs.com
Pratibha (Cytel) & Reetabrata (TCS): Imputation of Missing Data through Bayesian Approach. ConSPIC, 8th - 10th Oct, 2015
Overview
1 Introduction and Background
2 Mechanisms
3 Motivation
4 Objective
5 Data and Methods
6 Results
7 Conclusion and Discussion
Why talk about Missing Data?
Randomized clinical trials are the primary tool for
evaluating new medical interventions.
More than $7 billion is spent every year on evaluating
drugs, devices, and biologics, and a substantial
percentage of the outcomes of interest is often missing.
Missingness reduces the benefit provided by
randomization and introduces potential biases in the
comparison of the treatment groups.
As many as 65% of articles in PubMed journals do
not report how missing data were handled.
Health authorities encourage better approaches to
handling missing data.
"The only really good solution to the missing data problem is not to have any" - Paul Allison
How do we define Missing Data?
Missing Data
Data that were planned to be recorded but are not available.
Broadly, there are two types of missing data:
Monotone missing data
All data for a subject are missing after a certain time-point.
Serious problem in interpreting the results of a trial.
Non-monotone or intermediate missing data
A subject misses a visit but contributes data at later visits.
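The two patterns defined above can be told apart mechanically. The following is a minimal Python sketch, assuming each subject's visits are held in a list with None marking a missing value; the function name missing_pattern is hypothetical and not part of the original slides.

```python
def missing_pattern(visits):
    """Classify one subject's visit sequence (None marks a missing value)."""
    missing = [v is None for v in visits]
    if not any(missing):
        return "complete"
    first_missing = missing.index(True)
    # Monotone: once a value is missing, every later value is missing too.
    if all(missing[first_missing:]):
        return "monotone"
    # Otherwise the subject returned after a gap: intermediate missingness.
    return "non-monotone"


print(missing_pattern([4, 3, None, None]))  # dropout after the second visit
print(missing_pattern([4, None, 3, 2]))     # one skipped visit
```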
Types of Missingness
1. Missing Completely at Random (MCAR)
Missingness is independent of both observed and unobserved data.
Example:
• Patient moving to another city for non-health reasons. Patients who drop
out from a study for this reason could be considered a random and
representative sample from the total study population.
2. Missing at Random (MAR)
Missingness depends only on observed data, not on unobserved data.
Example:
• Dropout due to previous lack of efficacy could be MAR, because it is in some
sense predictable from the observed data in the model.
• Men may be more likely to decline to answer some questions than women.
Types of Missingness
3. Missing Not At Random (MNAR)
Missingness depends on unobserved data, even after
accounting for the observed data.
Difficult to model
Example:
• It may happen that after a series of visits with good outcome, a patient
drops out due to lack of efficacy. In this situation the analysis model based
on the observed data, including relevant covariates, is likely to continue to
predict a good outcome, but it is usually unreasonable to expect the patient
to continue to derive benefit from treatment.
• Individuals with very high incomes are more likely to decline to answer
questions about their own income.
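The three mechanisms can be contrasted with a small simulation. This is only an illustrative Python sketch under assumed settings: the outcome model, the missingness probabilities (0.3, 0.6, 0.1) and the function name complete_case_mean are all hypothetical. The true mean outcome is 0, and the function returns the mean of the outcomes that remain observed.

```python
import random


def complete_case_mean(mechanism, n=20000, seed=1):
    """Mean of the observed outcomes under an assumed missingness mechanism."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        baseline = rng.gauss(0.0, 1.0)            # always observed
        outcome = baseline + rng.gauss(0.0, 1.0)  # true mean is 0
        if mechanism == "MCAR":
            p_missing = 0.3                             # unrelated to any data
        elif mechanism == "MAR":
            p_missing = 0.6 if baseline > 0 else 0.1    # driven by observed baseline
        else:  # "MNAR"
            p_missing = 0.6 if outcome > 0 else 0.1     # driven by the unseen outcome
        if rng.random() >= p_missing:
            observed.append(outcome)
    return sum(observed) / len(observed)


for mechanism in ("MCAR", "MAR", "MNAR"):
    print(mechanism, round(complete_case_mean(mechanism), 3))
```

In this sketch the complete-case mean stays close to 0 under MCAR and is pulled below 0 under both MAR and MNAR. The difference is that under MAR the bias could in principle be removed by modelling the outcome given the observed baseline, whereas under MNAR the observed data alone do not identify the correction.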
The Effect of Missing Values on Analysis and Interpretation
The following problems may affect the interpretation of the trial results when
some missing data are present.
Power and Variability
• Power of a trial will increase if the sample size is increased or if the
variability of the outcomes is reduced.
Bias
• Risk of bias in the estimation of the treatment effect from the observed
data depends upon the relationship between missingness, treatment and
outcome.
• Type of bias that can critically affect interpretation will depend upon
whether the objective of the study is to show a difference or demonstrate
non-inferiority/equivalence.
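The point on power can be made concrete: every subject lost to missing data lowers the number contributing to the analysis. The sketch below uses a normal approximation for a two-sided, two-sample comparison of means; the effect size, standard deviation and sample sizes are illustrative assumptions, and power_two_sample is a hypothetical helper, not something taken from the slides.

```python
from statistics import NormalDist


def power_two_sample(delta, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for a mean difference.

    Uses the usual normal approximation and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    z = NormalDist()
    se = sd * (2.0 / n_per_arm) ** 0.5   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(abs(delta) / se - z_crit)


planned = power_two_sample(delta=1.0, sd=2.0, n_per_arm=64)
after_dropout = power_two_sample(delta=1.0, sd=2.0, n_per_arm=45)  # about 30% lost
print(round(planned, 3), round(after_dropout, 3))
```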
Goals of Statistical Analysis with Missing Data
Goals of Statistical Analysis:
Minimize bias
Maximize use of available information
Obtain appropriate estimates of uncertainty
Key points to keep in mind:
Research question (i.e. the hypothesis under investigation)
Information in the observed data
Reason(s) for missing data
As statisticians/programmers we need to:
Consult with investigators on a design that minimizes missing data/information,
postulate plausible missingness mechanisms, perform valid analyses and
interpret the results.
What do the Regulatory Bodies (FDA/EMEA) recommend?
Avoid Missing Data wherever possible
Protocol to address potential impact and treatment of anticipated missing
data
Design strategies to minimize treatment and analysis dropouts
Continue to collect information on key outcomes for participants who
discontinue; record it and use it for analysis
Set a minimum rate of completeness for the primary outcome(s), based
on similar past trials
Specify the statistical methods and assumptions for handling missing data in
protocols in such a way that they are understood by clinicians
Focused efforts on training staff
What do the Regulatory Bodies (FDA/EMEA) recommend?
Avoid Single imputation methods like LOCF and BOCF as the primary
approach to the treatment of missing data unless underlying assumptions
are scientifically justified.
Parametric models and random effects models are to be used with caution, with all
assumptions clearly stated and accompanied by goodness-of-fit procedures.
Weighted generalized estimating equations methods should be more widely
used as an alternative to parametric modeling.
When substantial missing data are anticipated, auxiliary information
should be collected.
Sensitivity analyses are mandated as part of the primary reporting of findings
from clinical trials
Treatments for Missing Data: Traditional Approach
Listwise Deletion
• Omit cases with missing data and run analyses on what remains.
Simple Imputation Method - Last Observation Carried Forward (LOCF)
• A subject's missing responses are set equal to their last observed response;
the method is developed under the Missing Completely At Random (MCAR) framework
• Usually used in longitudinal (repeated measures) studies of continuous
outcomes
Simple Imputation Method - Baseline Observation Carried Forward (BOCF)
• Similar to LOCF, but here a patient's missing responses are assumed
equal to their observed baseline response.
Empirically developed models
• Unconditional and conditional mean imputation
• Best or worst case imputation
• Regression methods and Hot-deck imputation
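The two carried-forward rules above are simple enough to write out directly. The following is a minimal Python sketch, assuming one subject's visits are stored in order in a list, the first entry is the (observed) baseline, and None marks a missing value; the function names locf and bocf are illustrative.

```python
def locf(visits):
    """Last Observation Carried Forward: fill each gap with the latest observed value."""
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)   # stays None only if nothing has been observed yet
    return filled


def bocf(visits):
    """Baseline Observation Carried Forward: fill every gap with the baseline value."""
    baseline = visits[0]      # assumes the baseline visit is the first entry
    return [baseline if value is None else value for value in visits]


scores = [5, 4, None, 3, None, None]
print(locf(scores))  # [5, 4, 4, 3, 3, 3]
print(bocf(scores))  # [5, 4, 5, 3, 5, 5]
```

Both rules replace each missing value by a single fixed number, so the filled-in data carry no extra uncertainty for the imputed values.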
Treatments for Missing Data: Modern Approach
Full Information Maximum Likelihood (FIML) model
• Uses a pragmatic missing data estimation approach for structural equation
modeling.
• Produces unbiased parameter estimates and standard errors under MAR
and MCAR.
• Unlike the maximum likelihood method, FIML uses all available
information in all observations.
Mixed-Effect Model Repeated Measure (MMRM) model
• Uses a Restricted Maximum Likelihood (REML) solution for
longitudinal (repeated measures) analyses under the MAR assumption.
• Missing data are not explicitly imputed, so there is no effect on the other scores from
that same patient.
Objective
1 To examine the multiple imputation (MI) approach, specifically the Bayesian
Markov Chain Monte Carlo (MCMC) random sampling method, for the
analysis of incomplete data.
2 To compare analyses of the original data and of data imputed by last
observation carried forward (LOCF) and baseline observation carried
forward (BOCF) with MI through the Bayesian
MCMC random sampling method.
Data : Analytical Background
Testing of treatment (Hypothesis of Interest)
To evaluate the efficacy of Treatment A at Week-16 for change in
Vitreous Haze (VH) score.
Statistical Analysis Plan
The change from baseline to Week-16 in VH score is compared
between treatment groups using an Analysis of Covariance (ANCOVA)
model.
The model includes the fixed categorical effects of treatment group,
visit and treatment-by-visit interaction, as well as the fixed continuous
covariate of baseline VH.
The model provides adjusted least squares (LS) mean estimates at Week
16 for both treatment groups, the difference between the means, and the
corresponding standard error (SE), confidence interval (CI) and p-value.
Data: Simulation
A hypothetical clinical trial efficacy data set is simulated as input
for the MCMC method of missing data imputation.
100 patients are considered, with an amount of missing data similar to that
observed in our real data set.
The missing data pattern is randomly created.
This is not an exhaustive simulation study; it is meant only to demonstrate the application
of the Bayesian method for imputing missing values.
The data set is simulated to obtain a more complete comparison of
the three methods (BOCF, LOCF and MI).
Methods: Analytical Background
Bayesian Approach
In Bayesian inference, information about unknown parameters is
expressed in the form of a posterior probability distribution.
Markov Chain Monte Carlo (MCMC)
A Markov chain is a sequence of random variables in which the
distribution of each element depends on the value of the previous one.
Through MCMC, we can simulate the entire joint posterior distribution
of the unknown quantities and obtain simulation based estimates of
posterior parameters of interest.
It is a collection of methods for simulating random draws from
nonstandard distributions via Markov chains.
By repeatedly simulating steps of the chain, it simulates draws from the
distribution of interest.
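To make the idea of simulating draws via a Markov chain concrete, here is a minimal random-walk Metropolis sampler in Python. It is a generic MCMC illustration, not the data augmentation algorithm used later in the talk; the target (a standard normal), the step size and the function name metropolis are assumptions chosen for the sketch.

```python
import math
import random


def metropolis(log_density, start, n_draws, step=1.0, seed=0):
    """Random-walk Metropolis: each draw depends only on the previous one."""
    rng = random.Random(seed)
    x = start
    draws = []
    for _ in range(n_draws):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_density(proposal) - log_density(x)
        # Accept with probability min(1, density(proposal) / density(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        draws.append(x)
    return draws


# Target: standard normal, specified only up to a constant.
draws = metropolis(lambda v: -0.5 * v * v, start=0.0, n_draws=20000)
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(round(mean, 2), round(variance, 2))
```

After enough steps the draws behave like a (correlated) sample from the target, so their mean and variance approximate 0 and 1.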
Method: Data Augmentation (DA) Algorithm
Goal
To have the iterates converge to the stationary distribution.
To simulate an approximately independent draw of the missing values.
Assumption
Assuming that the data are from a multivariate normal distribution.
Data augmentation is applied to Bayesian inference with missing data by
repeating the following steps:
Step - 1
The imputation I-step:
Given an estimated mean vector and covariance matrix, the
I-step simulates the missing values for each observation independently.
The I-step draws values for $Y_{i(mis)}$ from the conditional distribution of
$Y_{i(mis)}$ given $Y_{i(obs)}$,
where $Y_{i(mis)}$ denotes the variables with missing values for observation $i$ and
$Y_{i(obs)}$ the variables with observed values for observation $i$.
Method: Data Augmentation (DA) Algorithm
Step - 2
The posterior P-step:
P-step simulates the posterior population mean vector and covariance
matrix from the complete sample estimates by using non-informative
prior.
These new estimates are then used in the I-step.
The two steps are iterated long enough for the chain to converge to its
stationary distribution, after which it simulates an
approximately independent draw of the missing values.
Summary
Current parameter estimate $\theta^{(t)}$ at the $t$-th iteration.
I-step draws $Y_{mis}^{(t+1)}$ from $P(Y_{mis} \mid Y_{obs}, \theta^{(t)})$
P-step draws $\theta^{(t+1)}$ from $P(\theta \mid Y_{obs}, Y_{mis}^{(t+1)})$
This creates a Markov chain $(Y_{mis}^{(1)}, \theta^{(1)}), (Y_{mis}^{(2)}, \theta^{(2)}), \ldots$
It converges in distribution to $P(Y_{mis}, \theta \mid Y_{obs})$.
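The alternation of I-step and P-step can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the multivariate normal algorithm of PROC MI: it assumes a single outcome y with missing values (None), one fully observed covariate x, a normal linear model for y given x, and the non-informative prior proportional to 1/sigma2. All names (data_augmentation, a, b, sigma2) are hypothetical.

```python
import math
import random


def data_augmentation(x, y, n_iter=200, seed=0):
    """Run one DA chain; return the last completed y and the last (a, b, sigma2)."""
    rng = random.Random(seed)
    n = len(y)
    x_bar = sum(x) / n
    xc = [xi - x_bar for xi in x]               # centred covariate
    sxx = sum(v * v for v in xc)
    miss = [i for i in range(n) if y[i] is None]
    obs_y = [v for v in y if v is not None]

    # Starting values theta(0), taken from the observed outcomes only.
    a = sum(obs_y) / len(obs_y)
    b = 0.0
    sigma2 = sum((v - a) ** 2 for v in obs_y) / (len(obs_y) - 1)
    y_fill = [a if v is None else v for v in y]

    for _ in range(n_iter):
        # I-step: draw each missing y_i given its observed x_i and current parameters.
        for i in miss:
            y_fill[i] = rng.gauss(a + b * xc[i], math.sqrt(sigma2))

        # P-step: draw the parameters from their posterior given the completed data.
        a_hat = sum(y_fill) / n
        b_hat = sum(xc[i] * y_fill[i] for i in range(n)) / sxx
        rss = sum((y_fill[i] - a_hat - b_hat * xc[i]) ** 2 for i in range(n))
        sigma2 = rss / rng.gammavariate((n - 2) / 2.0, 2.0)   # rss / chi-square(n - 2)
        a = rng.gauss(a_hat, math.sqrt(sigma2 / n))
        b = rng.gauss(b_hat, math.sqrt(sigma2 / sxx))
    return y_fill, (a, b, sigma2)


# Illustrative data: y = 2 + 1.5 x + noise, with about 30% of y deleted at random.
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
y = [2.0 + 1.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]
y = [None if data_rng.random() < 0.3 else v for v in y]
completed, (a, b, sigma2) = data_augmentation(x, y, n_iter=300, seed=1)
print(round(b, 2), round(sigma2, 2))
```

Running several chains with different seeds, or taking well-separated iterations of one chain, would give the multiple completed data sets that MI needs.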
Method: Application in SAS
Multiple Imputation step 1
The MCMC method is used in conjunction with the IMPUTE=MONOTONE
option to create an imputed data set with a monotone missing pattern.
Variables include treatment group and VH scores at baseline and
post-baseline analysis visits.
This method implies that VH scores are analysed as continuous variables
and treatment group is a dummy variable.
SAS Code
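/* Step 1: MCMC imputation of intermediate gaps, leaving a monotone pattern */
proc mi data=dset1 out=MIstep1 seed=27160 nimpute=1000 noprint;
   /* impute=monotone: impute only enough values to make the pattern monotone;
      chain=multiple: use a separate chain for each imputation */
   mcmc impute=monotone chain=multiple;
   var armn baseline week2 week4 week6 week8 week10
       week12 week14 week16;
run;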
Method: Application in SAS
Multiple Imputation step 2
Missing data are imputed with a regression method by using the
monotone data set from step 1
Variables include treatment group, stratification variables and VH scores
at baseline and post-baseline analysis visits.
This method implies that VH scores are analysed as continuous variables.
Output data set from step 1 (after rounding) is used as input data set for
step 2.
Only 1 imputation in step 2 (for each imputation from step 1).
SAS Code
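/* Step 2: complete each monotone data set from step 1 by regression */
proc mi data=MIstep1r out=MIstep2 seed=54320 nimpute=1 noprint;
   /* Assumed addition: BY processing, so that each step-1 imputation
      is completed separately, as stated on the slide */
   by _Imputation_;
   class armn stratum;
   monotone reg;
   var armn stratum baseline week2 week4 week6 week8 week10
       week12 week14 week16;
run;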
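Once each imputed data set has been analysed, the per-imputation results are combined into one estimate and standard error using Rubin's rules (Rubin, 1987); in SAS this pooling is what PROC MIANALYZE does. The Python sketch below shows the arithmetic only; the function name pool_rubin and the input numbers are illustrative and are not results from this study.

```python
import math


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their squared SEs by Rubin's rules."""
    m = len(estimates)
    q_bar = sum(estimates) / m                        # pooled point estimate
    within = sum(variances) / m                       # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between            # total variance
    return q_bar, math.sqrt(total)


# Three hypothetical imputations: treatment differences and their squared SEs.
estimate, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, round(se, 3))
```

The between-imputation term is what single imputation methods such as LOCF and BOCF leave out: it carries the uncertainty due to the values being imputed rather than observed.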
Result: Tabular representation of efficacy endpoint
Table 1: Change from baseline in VH score to Week 16, MITT population

Vitreous Haze (Miami 9-step scale)        Placebo (N=43)        Treatment A (N=57)
Baseline
  Number                                  43                    57
  Mean (SD)                               4.47 (1.96)           4.68 (2.49)
  Median                                  5.00                  5.00
  Min : Max                               1.0 : 8.0             1.0 : 8.0
Week 16
  Number                                  28                    44
  Mean (SD)                               4.18 (2.47)           4.64 (2.30)
  Median                                  3.50                  5.00
  Min : Max                               1.0 : 8.0             1.0 : 8.0
Change from Baseline
  Number                                  28                    44
  Mean (SD)                               -0.11 (3.58)          -0.25 (3.36)
  Median                                  0.50                  -1.00
  Min : Max                               -7.0 : 6.0            -7.0 : 6.0
Analysis: Original Data
  LS Means (SE)                           -0.42 (0.414)         0.06 (0.331)
  90% CI                                  (-1.103 to 0.262)     (-0.485 to 0.604)
  LS Mean difference (SE) vs. Placebo                           0.48 (0.530)
  90% CI                                                        (-0.393 to 1.354)
  p-value                                                       0.3653
Analysis: BOCF
  LS Means (SE)                           -0.18 (0.340)         -0.10 (0.295)
  90% CI                                  (-0.739 to 0.380)     (-0.590 to 0.383)
  LS Mean difference (SE) vs. Placebo                           0.08 (0.450)
  90% CI                                                        (-0.665 to 0.817)
  p-value                                                       0.8659
Analysis: LOCF
  LS Means (SE)                           0.15 (0.329)          0.15 (0.329)
  90% CI                                  (-0.395 to 0.689)     (-0.197 to 0.745)
  LS Mean difference (SE) vs. Placebo                           0.13 (0.436)
  90% CI                                                        (-0.591 to 0.845)
  p-value                                                       0.7704
Analysis: Imputation (Bayesian)
  LS Means (SE)                           -0.78 (0.380)         -0.11 (0.317)
  90% CI                                  (-1.527 to -0.335)    (-0.729 to 0.515)
  LS Mean difference (SE) vs. Placebo                           0.67 (0.495)
  90% CI                                                        (-0.298 to 1.645)
  p-value                                                       0.1739
Conclusion and Discussion
From Table 1, the LS mean change in VH score from baseline
to Week 16 is higher in the Treatment A group than in the placebo group,
and the difference comes closest to statistical significance under
Bayesian imputation, although it remains non-significant.
The p-value is smaller under Bayesian imputation (0.1739) than for the
original data (0.3653), whereas LOCF (0.7704) and BOCF (0.8659) give
larger p-values.
The Bayesian approach lends itself naturally to different choices of prior
distributions encoding assumptions about the missing data process.
It offers the possibility of including informative prior information about
the missing data process, but models can become computationally
challenging.
The procedure can be used in the data preparation steps before calling
the analysis model to simplify the clinical efficacy data analysis process.
References
Allison, P.D. (2000). Multiple Imputation for Missing Data: A
Cautionary Tale. Sociological Methods and Research, 28: 301-309.
Barnard J, Rubin DB (1999). Small-Sample Degrees of Freedom with
Multiple Imputation. Biometrika, 86: 948-955.
National Research Council. The Prevention and Treatment of Missing
Data in Clinical Trials. The Panel on Handling Missing Data in Clinical
Trials
Rubin DB (1976). Inference and Missing Data. Biometrika, 63: 581-592.
Rubin DB (1987). Multiple Imputation for Nonresponse in Surveys. John
Wiley & Sons.
Rubin DB (1996). Imputation After 18+ Years. Journal of the American
Statistical Association, 91: 473-489.
Yuan, Yang (2011). Multiple Imputation Using SAS Software. Journal
of Statistical Software, 45(6): 1-25.

Big data, RWE and AI in Clinical Trials made simple
 
Unified Medical Data Platform focused on Accuracy
Unified Medical Data Platform focused on AccuracyUnified Medical Data Platform focused on Accuracy
Unified Medical Data Platform focused on Accuracy
 
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docx
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docxChapter 19Basic Quantitative Data AnalysisData Cleaning.docx
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docx
 
Healthcare Conference 2013 : Toekomstvisie op ICT in de gezondheidszorg - pro...
Healthcare Conference 2013 : Toekomstvisie op ICT in de gezondheidszorg - pro...Healthcare Conference 2013 : Toekomstvisie op ICT in de gezondheidszorg - pro...
Healthcare Conference 2013 : Toekomstvisie op ICT in de gezondheidszorg - pro...
 
Data Quality Matters: EHR Data Quality, MACRA, and Improving Healthcare
Data Quality Matters: EHR Data Quality, MACRA, and Improving HealthcareData Quality Matters: EHR Data Quality, MACRA, and Improving Healthcare
Data Quality Matters: EHR Data Quality, MACRA, and Improving Healthcare
 
Embi cri review-2012-final
Embi cri review-2012-finalEmbi cri review-2012-final
Embi cri review-2012-final
 
Data mining for diabetes readmission
Data mining for diabetes readmissionData mining for diabetes readmission
Data mining for diabetes readmission
 
Theory and Practice of Integrating Machine Learning and Conventional Statisti...
Theory and Practice of Integrating Machine Learning and Conventional Statisti...Theory and Practice of Integrating Machine Learning and Conventional Statisti...
Theory and Practice of Integrating Machine Learning and Conventional Statisti...
 
Deciphering the dilemma of parametric and nonparametric tests
Deciphering the dilemma of parametric and nonparametric testsDeciphering the dilemma of parametric and nonparametric tests
Deciphering the dilemma of parametric and nonparametric tests
 

D1S1T3N4_Pratibha Jalui & Reetabrata Bhattacharyya

  • 1. Imputation of Missing Data through Bayesian Approach Pratibha Jalui Cytel Statistical Software & Services Pvt. Ltd, Pune Email: pratibha.jalui@cytel.com Reetabrata Bhattacharyya Tata Consultancy Services Limited, Mumbai Email: reetabrata.b@tcs.com Pratibha (Cytel) & Reetabrata (TCS) Imputation of Missing Data through Bayesian ConSPIC, 8th - 10th Oct, 2015 1 / 24
  • 2. Overview: 1 Introduction and Background; 2 Mechanisms; 3 Motivation; 4 Objective; 5 Data and Methods; 6 Results; 7 Conclusion and Discussion
  • 3. Why talk about Missing Data? Randomized clinical trials are the primary tool for evaluating new medical interventions. More than $7 billion is spent every year on evaluating drugs, devices, and biologics, yet a substantial percentage of the outcomes of interest is often missing. Missingness reduces the benefit provided by randomization and introduces potential biases in the comparison of the treatment groups. As many as 65% of articles in PubMed journals do not report how missing data were handled. Health authorities encourage better approaches to handling missing data.
  • 4. Why talk about Missing Data? (continued) "The only really good solution to the missing data problem is not to have any" - Paul Allison
  • 5. How do we define Missing Data? Missing data: data that were planned to be recorded but are not available. Broadly, there are two types of missing data. Monotone missing data: all data for a subject are missing after a certain time point; this is a serious problem when interpreting the results of a trial. Non-monotone or intermittent missing data: a subject misses a visit but contributes data at later visits.
  • 6. Types of Missingness 1. Missing Completely at Random (MCAR): missingness is independent of both observed and unobserved data. Example: • A patient moves to another city for non-health reasons. Patients who drop out of a study for this reason could be considered a random and representative sample of the total study population. 2. Missing at Random (MAR): missingness depends only on observed data. Examples: • Dropout due to a previous lack of efficacy could be MAR, because it is in some sense predictable from the observed data in the model. • Men may be more likely than women to decline to answer some questions.
  • 7. Types of Missingness 3. Missing Not At Random (MNAR): missingness depends on unobserved data, even after accounting for the observed data. This mechanism is difficult to model. Examples: • After a series of visits with good outcomes, a patient drops out due to lack of efficacy. In this situation the analysis model based on the observed data, including relevant covariates, is likely to continue to predict a good outcome, but it is usually unreasonable to expect the patient to continue to derive benefit from treatment. • Individuals with very high incomes are more likely to decline to answer questions about their own income.
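The three mechanisms on the two slides above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the presentation: it uses Python's standard library instead of SAS, and the function names, the logistic form of the missingness probability, and the 30% MCAR rate are all illustrative assumptions.

```python
import math
import random


def missing_probability(mechanism, baseline, outcome, rate=0.3):
    """Probability that one subject's outcome is missing (illustrative model)."""
    if mechanism == "MCAR":
        # Same probability for everyone, unrelated to any data.
        return rate
    if mechanism == "MAR":
        # Depends only on the baseline value, which is always observed.
        return 1.0 / (1.0 + math.exp(-baseline))
    if mechanism == "MNAR":
        # Depends on the outcome itself, which is unobserved when missing.
        return 1.0 / (1.0 + math.exp(-outcome))
    raise ValueError(f"unknown mechanism: {mechanism}")


def observed_outcomes(mechanism, subjects, rng):
    """Keep the outcomes of subjects whose value did not go missing."""
    return [
        y for x, y in subjects
        if rng.random() >= missing_probability(mechanism, x, y)
    ]


# Hypothetical data: a standardized baseline and a correlated outcome.
rng = random.Random(2015)
subjects = []
for _ in range(20000):
    x = rng.gauss(0.0, 1.0)
    y = 0.5 * x + rng.gauss(0.0, 1.0)
    subjects.append((x, y))

full_mean = sum(y for _, y in subjects) / len(subjects)
observed_mean = {}
for mechanism in ("MCAR", "MAR", "MNAR"):
    kept = observed_outcomes(mechanism, subjects, rng)
    observed_mean[mechanism] = sum(kept) / len(kept)
    print(mechanism, round(observed_mean[mechanism], 3), "vs full", round(full_mean, 3))
```

In this sketch the complete-case mean stays close to the full-data mean only under MCAR. Under MAR it shifts too, but the shift can be corrected by conditioning on the observed baseline, whereas under MNAR the observed data alone do not contain the information needed to correct it.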
  • 8. The Effect of Missing Values on Analysis and Interpretation. The following problems may affect the interpretation of the trial results when some data are missing. Power and Variability: • The power of a trial increases if the sample size is increased or if the variability of the outcomes is reduced; missing data reduce the number of usable observations and therefore reduce power. Bias: • The risk of bias in the estimate of the treatment effect from the observed data depends on the relationship between missingness, treatment and outcome. • The type of bias that can critically affect interpretation depends on whether the objective of the study is to show a difference or to demonstrate non-inferiority/equivalence.
  • 9. Goals of Statistical Analysis with Missing Data. Goals of statistical analysis: minimize bias; maximize the use of available information; obtain appropriate estimates of uncertainty. Key points to keep in mind: the research question (i.e. the hypothesis under investigation); the information in the observed data; the reason(s) for missing data. As statisticians/programmers we need to: consult with investigators on a design that minimizes missing data/information, postulate plausible missingness mechanisms, perform a valid analysis, and interpret the results.
  • 10. What do the Regulatory Bodies (FDA/EMEA) recommend? Avoid missing data wherever possible. The protocol should address the potential impact and treatment of anticipated missing data. Design strategies to minimize treatment and analysis dropouts. Continue to collect information on key outcomes for participants who discontinue; record it and use it in the analysis. Set a minimum rate of completeness for the primary outcome(s), based on similar past trials. Specify the statistical methods and assumptions for handling missing data in the protocol in a way that clinicians can understand. Focus efforts on training staff.
  • 11. What do the Regulatory Bodies (FDA/EMEA) recommend? Avoid single imputation methods such as LOCF and BOCF as the primary approach to the treatment of missing data unless the underlying assumptions are scientifically justified. Parametric models and random effects models should be used with caution, with all assumptions clearly stated and accompanied by goodness-of-fit procedures. Weighted generalized estimating equations methods should be more widely used as an alternative to parametric modeling. When substantial missing data are anticipated, auxiliary information should be collected. Sensitivity analyses are mandated as part of the primary reporting of findings from clinical trials.
  • 12. Treatments for Missing Data: Traditional Approach. Listwise deletion: • Omit cases with missing data and run the analyses on what remains. Simple imputation method, Last Observation Carried Forward (LOCF): • A subject's missing responses are set equal to their last observed response; the method was developed under the Missing Completely At Random (MCAR) framework. • Usually used in longitudinal (repeated measures) studies of continuous outcomes. Simple imputation method, Baseline Observation Carried Forward (BOCF): • Similar to LOCF, but a patient's missing responses are set equal to their observed baseline response. Empirically developed models: • Unconditional and conditional mean imputation • Best or worst case imputation • Regression methods and hot-deck imputation
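The three simplest traditional approaches on the slide above are easy to state in code. This is a minimal sketch under assumed conventions, not the presenters' implementation: each subject is a list of visit values with the baseline first, `None` marks a missing visit, and the function names are hypothetical.

```python
from typing import List, Optional

Visits = List[Optional[float]]


def listwise_delete(subjects: List[Visits]) -> List[Visits]:
    """Keep only subjects with no missing visit."""
    return [visits for visits in subjects if all(v is not None for v in visits)]


def locf(visits: Visits) -> Visits:
    """Replace each missing visit with the last observed value before it."""
    filled: Visits = []
    last = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bocf(visits: Visits) -> Visits:
    """Replace each missing visit with the baseline (first) value."""
    baseline = visits[0]
    return [baseline if value is None else value for value in visits]


# One hypothetical subject: baseline 5.0, an intermittent gap, then dropout.
subject = [5.0, 4.0, None, 3.0, None, None]
print(locf(subject))  # [5.0, 4.0, 4.0, 3.0, 3.0, 3.0]
print(bocf(subject))  # [5.0, 4.0, 5.0, 3.0, 5.0, 5.0]
print(listwise_delete([subject, [5.0, 4.0, 4.0, 3.0, 2.0, 2.0]]))
```

Both carry-forward methods fill every gap with a single fixed value, so the completed data show no uncertainty about the imputed visits.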
  • 13. Treatments for Missing Data: Modern Approach. Full Information Maximum Likelihood (FIML) model: • A pragmatic missing-data estimation approach for structural equation modeling. • Produces unbiased parameter estimates and standard errors under MAR and MCAR. • Unlike maximum likelihood restricted to complete cases, FIML uses all available information in all observations. Mixed-Effect Model Repeated Measure (MMRM) model: • Uses a restricted maximum likelihood (REML) solution for longitudinal (repeated measures) analyses under the MAR assumption. • Missing data are not explicitly imputed, and a missing score has no effect on the other scores from the same patient.
  • 14. Objective. 1 To examine the multiple imputation (MI) approach, specifically the Bayesian Markov Chain Monte Carlo (MCMC) random sampling method, for the analysis of incomplete data. 2 To compare the analysis of the original data and of data imputed by last observation carried forward (LOCF) and baseline observation carried forward (BOCF) with MI through the Bayesian MCMC random sampling method.
  • 15. Data: Analytical Background. Testing of treatment (hypothesis of interest): to evaluate the efficacy of Treatment A at Week 16 for change in Vitreous Haze (VH) score. Statistical Analysis Plan: the change from baseline to Week 16 in VH score is compared between treatment groups using an Analysis of Covariance (ANCOVA) model. The model includes the fixed categorical effects of treatment group, visit, and treatment-by-visit interaction, as well as the fixed continuous covariate of baseline VH. The model provides adjusted least-squares (LS) mean estimates at Week 16 for both treatment groups, the difference between the means, and the corresponding standard error (SE), confidence interval (CI) and p-value.
  • 16. Data: Simulation. A simulated hypothetical clinical trial efficacy data set is used as input for the MCMC method of missing data imputation. 100 patients are considered, with an amount of missing data similar to that observed in our real data set. The missing data pattern is randomly created. This simulation is intended only to demonstrate the application of the Bayesian method for imputing missing values. A data set is simulated to obtain a more complete comparison of the three methods (BOCF, LOCF and MI).
  • 17. Methods: Analytical Background. Bayesian approach: in Bayesian inference, information about unknown parameters is expressed in the form of a posterior probability distribution. Markov Chain Monte Carlo (MCMC): a Markov chain is a sequence of random variables in which the distribution of each element depends on the value of the previous one. MCMC is a collection of methods for simulating random draws from nonstandard distributions via Markov chains; by repeatedly simulating steps of the chain, it simulates draws from the distribution of interest. Through MCMC, we can simulate the entire joint posterior distribution of the unknown quantities and obtain simulation-based estimates of the posterior parameters of interest.
  • 18. Method: Data Augmentation (DA) Algorithm. Goal: to have the iterates converge to the stationary distribution and then to simulate an approximately independent draw of the missing values. Assumption: the data are from a multivariate normal distribution. Data augmentation is applied to Bayesian inference with missing data by repeating the following steps. Step 1, the imputation I-step: given the current estimates of the mean vector and covariance matrix, the I-step simulates the missing values for each observation independently. That is, it draws values for $Y_{i(mis)}$ from the conditional distribution of $Y_{i(mis)}$ given $Y_{i(obs)}$, where $Y_{i(mis)}$ denotes the variables with missing values for observation $i$ and $Y_{i(obs)}$ denotes the variables with observed values for observation $i$.
  • 19. Method: Data Augmentation (DA) Algorithm. Step 2, the posterior P-step: the P-step simulates the posterior population mean vector and covariance matrix from the complete sample estimates, using a non-informative prior. These new estimates are then used in the next I-step. The iterates converge to their stationary distribution and are then used to simulate an approximately independent draw of the missing values. Summary: with current parameter estimate $\theta^{(t)}$ at the $t$-th iteration, the I-step draws $Y_{mis}^{(t+1)}$ from $P(Y_{mis} \mid Y_{obs}, \theta^{(t)})$ and the P-step draws $\theta^{(t+1)}$ from $P(\theta \mid Y_{obs}, Y_{mis}^{(t+1)})$. This creates a Markov chain $(Y_{mis}^{(1)}, \theta^{(1)}), (Y_{mis}^{(2)}, \theta^{(2)}), \ldots$ that converges in distribution to $P(Y_{mis}, \theta \mid Y_{obs})$.
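The alternation between I-step and P-step can be sketched in a deliberately simplified setting. The code below is not the algorithm as implemented in SAS PROC MI: instead of a full multivariate normal model it assumes only two variables, a fully observed baseline `x` and an outcome `y` with some values missing, so the P-step reduces to drawing the parameters of the regression of `y` on `x` under the non-informative prior $p(\alpha, \beta, \sigma^2) \propto 1/\sigma^2$. All names and the simulated data are hypothetical.

```python
import math
import random


def p_step(x, y, rng):
    """P-step: draw (alpha, beta, sigma2) from the posterior given completed data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    rss = sum((yi - y_bar - beta_hat * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))
    # sigma^2 | data ~ RSS / chi-square(n - 2); chi-square(k) is Gamma(k/2, scale 2).
    sigma2 = rss / rng.gammavariate((n - 2) / 2.0, 2.0)
    # With x centered, intercept and slope are independent given sigma^2.
    alpha = rng.gauss(y_bar, math.sqrt(sigma2 / n))
    beta = rng.gauss(beta_hat, math.sqrt(sigma2 / sxx))
    return alpha, beta, sigma2, x_bar


def i_step(x, y_obs, params, rng):
    """I-step: draw each missing y from its conditional distribution given x."""
    alpha, beta, sigma2, x_bar = params
    sd = math.sqrt(sigma2)
    return [
        rng.gauss(alpha + beta * (xi - x_bar), sd) if yi is None else yi
        for xi, yi in zip(x, y_obs)
    ]


def data_augmentation(x, y_obs, n_iter=200, seed=0):
    """Run one chain and return the completed outcome vector from the last I-step."""
    rng = random.Random(seed)
    # Starting parameters: a posterior draw based on the complete cases only.
    pairs = [(xi, yi) for xi, yi in zip(x, y_obs) if yi is not None]
    params = p_step([p[0] for p in pairs], [p[1] for p in pairs], rng)
    y_complete = list(y_obs)
    for _ in range(n_iter):
        y_complete = i_step(x, y_obs, params, rng)   # I-step
        params = p_step(x, y_complete, rng)          # P-step
    return y_complete


# Hypothetical data: 100 subjects, roughly a quarter of outcomes missing.
data_rng = random.Random(16)
x = [data_rng.gauss(4.5, 2.0) for _ in range(100)]
y_true = [0.8 * xi + data_rng.gauss(0.0, 1.0) for xi in x]
y_obs = [None if data_rng.random() < 0.25 else yi for yi in y_true]

# One independent chain per imputation, five completed data sets in total.
imputations = [data_augmentation(x, y_obs, seed=s) for s in range(5)]
for completed in imputations:
    print(round(sum(completed) / len(completed), 3))
```

Observed values are never altered; only the missing entries differ between the completed data sets, and that between-imputation variation is what carries the uncertainty about the missing values into the later analysis.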
  • 20. Method: Application in SAS. Multiple Imputation step 1: the MCMC method is used in conjunction with the IMPUTE=MONOTONE option to create an imputed data set with a monotone missing pattern. Variables include treatment group and VH scores at baseline and post-baseline analysis visits. This method implies that VH scores are analysed as continuous variables and treatment group is a dummy variable. SAS code:
    proc mi data=dset1 out=MIstep1 seed=27160 nimpute=1000 noprint;
      mcmc impute=monotone chain=multiple;
      var armn baseline week2 week4 week6 week8 week10 week12 week14 week16;
    run;
  • 21. Method: Application in SAS. Multiple Imputation step 2: missing data are imputed with a regression method, using the monotone data set from step 1. Variables include treatment group, stratification variables and VH scores at baseline and post-baseline analysis visits. This method implies that VH scores are analysed as continuous variables. The output data set from step 1 (after rounding) is used as the input data set for step 2. Only 1 imputation is made in step 2 (for each imputation from step 1). SAS code:
    proc mi data=MIstep1r out=MIstep2 seed=54320 nimpute=1 noprint;
      by _Imputation_;  /* one imputation per step-1 data set; input must be sorted by _Imputation_ */
      class armn stratum;
      monotone reg;
      var armn stratum baseline week2 week4 week6 week8 week10 week12 week14 week16;
    run;
  • 22. Result: Tabular representation of efficacy endpoint
    Table 1: Change from baseline in VH score to Week 16, MITT population
    Vitreous Haze (Miami 9-step scale)    | Placebo (N=43)      | Treatment A (N=57)
    Baseline
      Number                              | 43                  | 57
      Mean (SD)                           | 4.47 (1.96)         | 4.68 (2.49)
      Median                              | 5.00                | 5.00
      Min : Max                           | 1.0 : 8.0           | 1.0 : 8.0
    Week 16
      Number                              | 28                  | 44
      Mean (SD)                           | 4.18 (2.47)         | 4.64 (2.30)
      Median                              | 3.50                | 5.00
      Min : Max                           | 1.0 : 8.0           | 1.0 : 8.0
    Change from Baseline
      Number                              | 28                  | 44
      Mean (SD)                           | -0.11 (3.58)        | -0.25 (3.36)
      Median                              | 0.50                | -1.00
      Min : Max                           | -7.0 : 6.0          | -7.0 : 6.0
    Analysis: Original Data
      LS Means (SE)                       | -0.42 (0.414)       | 0.06 (0.331)
      90% CI                              | (-1.103 to 0.262)   | (-0.485 to 0.604)
      LS Mean difference (SE) vs. Placebo |                     | 0.48 (0.530)
      90% CI                              |                     | (-0.393 to 1.354)
      p-value                             |                     | 0.3653
    Analysis: BOCF
      LS Means (SE)                       | -0.18 (0.340)       | -0.10 (0.295)
      90% CI                              | (-0.739 to 0.380)   | (-0.590 to 0.383)
      LS Mean difference (SE) vs. Placebo |                     | 0.08 (0.450)
      90% CI                              |                     | (-0.665 to 0.817)
      p-value                             |                     | 0.8659
    Analysis: LOCF
      LS Means (SE)                       | 0.15 (0.329)        | 0.15 (0.329)
      90% CI                              | (-0.395 to 0.689)   | (-0.197 to 0.745)
      LS Mean difference (SE) vs. Placebo |                     | 0.13 (0.436)
      90% CI                              |                     | (-0.591 to 0.845)
      p-value                             |                     | 0.7704
    Analysis: Imputation (Bayesian)
      LS Means (SE)                       | -0.78 (0.380)       | -0.11 (0.317)
      90% CI                              | (-1.527 to -0.335)  | (-0.729 to 0.515)
      LS Mean difference (SE) vs. Placebo |                     | 0.67 (0.495)
      90% CI                              |                     | (-0.298 to 1.645)
      p-value                             |                     | 0.1739
  • 23. Conclusion and Discussion. From Table 1, the LS mean change in VH score from baseline to Week 16 is higher in the Treatment A group than in the placebo group, and the difference comes closest to statistical significance under Bayesian imputation. The p-value is smaller under Bayesian imputation (0.1739) than for the original data (0.3653), whereas LOCF (0.7704) and BOCF (0.8659) give larger p-values. The Bayesian approach lends itself naturally to different choices of prior distributions that encode assumptions about the missing data process. It offers the possibility of including informative prior information about the missing data process, but the models can become computationally challenging. The procedure can be used in the data preparation steps, before calling the analysis model, to simplify the clinical efficacy data analysis process.
  • 24. References
    Allison PD (2000). Multiple Imputation for Missing Data: A Cautionary Tale. Sociological Methods and Research, 28: 301-309.
    Barnard J, Rubin DB (1999). Small-Sample Degrees of Freedom with Multiple Imputation. Biometrika, 86: 948-955.
    National Research Council. The Prevention and Treatment of Missing Data in Clinical Trials. The Panel on Handling Missing Data in Clinical Trials.
    Rubin DB (1976). Inference and Missing Data. Biometrika, 63: 581-592.
    Rubin DB (1987). Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons.
    Rubin DB (1996). Multiple Imputation After 18+ Years. Journal of the American Statistical Association, 91: 473-489.
    Yuan Y (2011). Multiple Imputation Using SAS Software. Journal of Statistical Software, 45(6): 1-25.