Searched for:

in-biosketch:yes

person:diazi07

Total Results:

143


Variable importance and prediction methods for longitudinal problems with missing variables

Díaz, Iván; Hubbard, Alan; Decker, Anna; Cohen, Mitchell
We present prediction and variable importance (VIM) methods for longitudinal data sets containing continuous and binary exposures subject to missingness. We demonstrate the use of these methods for prognosis of medical outcomes of severe trauma patients, a field in which current medical practice involves rules of thumb and scoring methods that only use a few variables and ignore the dynamic and high-dimensional nature of trauma recovery. Well-principled prediction and VIM methods can provide a tool to make care decisions informed by the patient's high-dimensional physiological and clinical history. Our VIM parameters are analogous to slope coefficients in adjusted regressions, but they do not depend on a specific statistical model, nor do they require a certain functional form of the prediction regression to be estimated. In addition, they can be causally interpreted under causal and statistical assumptions as the expected outcome under time-specific clinical interventions, related to changes in the mean of the outcome if each individual experiences a specified change in the variable (keeping other variables in the model fixed). Better yet, the targeted MLE used is doubly robust and locally efficient. Because the proposed VIM does not constrain the prediction model fit, we use a very flexible ensemble learner (the SuperLearner), which returns a linear combination of a list of user-given algorithms. Not only is such a prediction algorithm intuitively appealing, it has theoretical justification as being asymptotically equivalent to the oracle selector. The results of the analysis show effects whose size and significance would not have been found using a parametric approach (such as stepwise regression or LASSO). In addition, the procedure is even more compelling as the predictor on which it is based showed significant improvements in cross-validated fit, for instance area under the curve (AUC) for a receiver operating characteristic (ROC) curve. Thus, given that 1) our VIM applies to any model fitting procedure, 2) under assumptions has meaningful clinical (causal) interpretations and 3) has asymptotic (influence-curve) based robust inference, it provides a compelling alternative to existing methods for estimating variable importance in high-dimensional clinical (or other) data.
PMCID:4376910
PMID: 25815719
ISSN: 1932-6203
CID: 5304432

Discussion of Identification, Estimation and Approximation of Risk under Interventions that Depend on the Natural Value of Treatment Using Observational Data, by Jessica Young, Miguel Hernán, and James Robins

van der Laan, Mark J; Luedtke, Alexander R; Díaz, Iván
Young, Hernán, and Robins consider the mean outcome under a dynamic intervention that may rely on the natural value of treatment. They first identify this value with a statistical target parameter, and then show that this statistical target parameter can also be identified with a causal parameter which gives the mean outcome under a stochastic intervention. The authors then describe estimation strategies for these quantities. Here we augment the authors' insightful discussion by sharing our experiences in situations where two causal questions lead to the same statistical estimand, or the newer problem that arises in the study of data adaptive parameters, where two statistical estimands can lead to the same estimation problem. Given a statistical estimation problem, we encourage others to always use a robust estimation framework where the data generating distribution truly belongs to the statistical model. We close with a discussion of a framework which has these properties.
PMCID:4666557
PMID: 26636024
ISSN: 2193-3677
CID: 5304452

Estimating population treatment effects from a survey subsample

Rudolph, Kara E; Díaz, Iván; Rosenblum, Michael; Stuart, Elizabeth A
We considered the problem of estimating an average treatment effect for a target population using a survey subsample. Our motivation was to generalize a treatment effect that was estimated in a subsample of the National Comorbidity Survey Replication Adolescent Supplement (2001-2004) to the population of US adolescents. To address this problem, we evaluated easy-to-implement methods that account for both nonrandom treatment assignment and a nonrandom 2-stage selection mechanism. We compared the performance of a Horvitz-Thompson estimator using inverse probability weighting and 2 doubly robust estimators in a variety of scenarios. We demonstrated that the 2 doubly robust estimators generally outperformed inverse probability weighting in terms of mean-squared error even under misspecification of one of the treatment, selection, or outcome models. Moreover, the doubly robust estimators are easy to implement and provide an attractive alternative to inverse probability weighting for applied epidemiologic researchers. We demonstrated how to apply these estimators to our motivating example.
PMCID:4172168
PMID: 25190679
ISSN: 1476-6256
CID: 5304972
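
The contrast this abstract draws between inverse probability weighting and doubly robust estimation can be illustrated with a minimal sketch. It is not the authors' implementation: it ignores the 2-stage survey selection mechanism entirely, uses a simulated data set with one binary covariate, and every function name and numeric value below is a hypothetical chosen for illustration. The doubly robust estimator shown is the standard augmented IPW form, which stays close to the true effect when either the treatment model or the outcome model is correct.

```python
import random


def true_propensity(w):
    # Hypothetical treatment model: P(A = 1 | W = w).
    return 0.7 if w == 1 else 0.3


def wrong_propensity(w):
    # Deliberately misspecified treatment model (ignores w).
    return 0.5


def true_outcome(a, w):
    # Hypothetical outcome regression E[Y | A = a, W = w]; true effect is 1.
    return 1.0 * a + 2.0 * w


def wrong_outcome(a, w):
    # Deliberately misspecified outcome model.
    return 0.0


def simulate(n, seed=0):
    """Draw n rows (w, a, y) with treatment confounded by w."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < true_propensity(w) else 0
        y = true_outcome(a, w) + rng.gauss(0.0, 1.0)
        rows.append((w, a, y))
    return rows


def ipw_ate(rows, propensity):
    """Horvitz-Thompson style IPW estimate of the average treatment effect."""
    total = 0.0
    for w, a, y in rows:
        g = propensity(w)
        total += a * y / g - (1 - a) * y / (1.0 - g)
    return total / len(rows)


def aipw_ate(rows, propensity, outcome):
    """Augmented IPW (doubly robust) estimate of the average treatment effect."""
    total = 0.0
    for w, a, y in rows:
        g = propensity(w)
        m1, m0 = outcome(1, w), outcome(0, w)
        # Outcome-model contrast plus a weighted residual correction.
        total += (m1 - m0) + a * (y - m1) / g - (1 - a) * (y - m0) / (1.0 - g)
    return total / len(rows)


if __name__ == "__main__":
    rows = simulate(20000, seed=1)
    print("IPW, wrong treatment model: ", ipw_ate(rows, wrong_propensity))
    print("AIPW, wrong treatment model:", aipw_ate(rows, wrong_propensity, true_outcome))
    print("AIPW, wrong outcome model:  ", aipw_ate(rows, true_propensity, wrong_outcome))
```

In this toy setup the IPW estimate with the misspecified treatment model is biased away from 1, while the augmented estimator recovers a value near 1 as long as one of its two models is right.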

Targeted Maximum Likelihood Estimation using Exponential Families [PrePrint]

Diaz, Ivan; Rosenblum, Michael
ORIGINAL:0015896
ISSN: 2331-8422
CID: 5305492

Sensitivity analysis for causal inference under unmeasured confounding and measurement error problems

Díaz, Iván; van der Laan, Mark J
In this article, we present a sensitivity analysis for drawing inferences about parameters that are not estimable from observed data without additional assumptions. We present the methodology using two different examples: a causal parameter that is not identifiable due to violations of the randomization assumption, and a parameter that is not estimable in the nonparametric model due to measurement error. Existing methods for tackling these problems assume a parametric model for the type of violation to the identifiability assumption and require the development of new estimators and inference for every new model. The method we present can be used in conjunction with any existing asymptotically linear estimator of an observed data parameter that approximates the unidentifiable full data parameter and does not require the study of additional models.
PMID: 24246288
ISSN: 1557-4679
CID: 5304382

Assessing the causal effect of policies: an example using stochastic interventions

Díaz, Iván; van der Laan, Mark J
Assessing the causal effect of an exposure often involves the definition of counterfactual outcomes in a hypothetical world in which the stochastic nature of the exposure is modified. Although stochastic interventions are a powerful tool to measure the causal effect of a realistic intervention that intends to alter the population distribution of an exposure, their importance for answering questions about plausible policy interventions has been obscured by the generalized use of deterministic interventions. In this article, we follow the approach described in Díaz and van der Laan (2012) to define and estimate the effect of an intervention that is expected to cause a truncation in the population distribution of the exposure. The observed data parameter that identifies the causal parameter of interest is established, as well as its efficient influence function under the non-parametric model. Inverse probability of treatment weighted (IPTW), augmented IPTW and targeted minimum loss-based estimators (TMLE) are proposed, and their consistency and efficiency properties are determined. An extension to longitudinal data structures is presented and its use is demonstrated with a real data example.
PMID: 24246287
ISSN: 1557-4679
CID: 5304372

Time-dependent prediction and evaluation of variable importance using superlearning in high-dimensional clinical data

Hubbard, Alan; Munoz, Ivan Diaz; Decker, Anna; Holcomb, John B; Schreiber, Martin A; Bulger, Eileen M; Brasel, Karen J; Fox, Erin E; del Junco, Deborah J; Wade, Charles E; Rahbar, Mohammad H; Cotton, Bryan A; Phelan, Herb A; Myers, John G; Alarcon, Louis H; Muskat, Peter; Cohen, Mitchell J
BACKGROUND: Prediction of outcome after injury is fraught with uncertainty and statistically beset by misspecified models. Single-time point regression only gives prediction and inference at one time, of dubious value for continuous prediction of ongoing bleeding. New statistical machine learning techniques such as SuperLearner (SL) exist to make superior prediction at iterative time points while evaluating the changing relative importance of each measured variable on an outcome. This then can provide continuously changing prediction of outcome and evaluation of which clinical variables likely drive a particular outcome. METHODS: PROMMTT data were evaluated using both naive (standard stepwise logistic regression) and SL techniques to develop a time-dependent prediction of future mortality within discrete time intervals. We avoided both underfitting and overfitting using cross validation to select an optimal combination of predictors among candidate predictors/machine learning algorithms. SL was also used to produce interval-specific robust variable importance measures (VIM), resulting in an ordered list of variables, by time point, that have the strongest impact on future mortality. RESULTS: Nine hundred eighty patients had complete clinical and outcome data and were included in the analysis. The prediction of ongoing transfusion with SL was superior to the naive approach for all time intervals (correlations of cross-validated predictions with the outcome were 0.819, 0.789, 0.792 for time intervals 30-90, 90-180, 180-360, >360 minutes). The estimated VIM of mortality also changed significantly at each time point. CONCLUSIONS: The SL technique for prediction of outcome from a complex dynamic multivariate data set is superior at each time interval to standard models. In addition, the SL VIM at each time point provides insight into the time-specific drivers of future outcome, patient trajectory, and targets for clinical intervention. Thus, this automated approach mimics clinical practice, changing form and content through time to optimize the accuracy of the prognosis based on the evolving trajectory of the patient.
PMCID:3744063
PMID: 23778512
ISSN: 2163-0763
CID: 5304912

Targeted Data Adaptive Estimation of the Causal Dose-Response Curve

Diaz, Ivan; Van der Laan, Mark J.
ISI:000218558300001
ISSN: 2193-3677
CID: 5304812

Population intervention causal effects based on stochastic interventions

Muñoz, Iván Díaz; van der Laan, Mark
Estimating the causal effect of an intervention on a population typically involves defining parameters in a nonparametric structural equation model (Pearl, 2000, Causality: Models, Reasoning, and Inference) in which the treatment or exposure is deterministically assigned in a static or dynamic way. We define a new causal parameter that takes into account the fact that intervention policies can result in stochastically assigned exposures. The statistical parameter that identifies the causal parameter of interest is established. Inverse probability of treatment weighting (IPTW), augmented IPTW (A-IPTW), and targeted maximum likelihood estimators (TMLE) are developed. A simulation study is performed to demonstrate the properties of these estimators, which include the double robustness of the A-IPTW and the TMLE. An application example using physical activity data is presented.
PMCID:4117410
PMID: 21977966
ISSN: 1541-0420
CID: 5304902
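
The IPTW idea for a stochastic intervention mentioned in this abstract can be sketched as follows. This is a simplified illustration rather than the paper's estimator: it assumes a binary exposure and a single binary covariate, estimates the exposure mechanism by stratum frequencies, and uses a hypothetical intervention that raises each stratum's exposure probability by 0.2. All names and numbers are assumptions made for the example. Each outcome is reweighted by the ratio of the intervened to the observed exposure probability.

```python
import random
from collections import defaultdict


def simulate(n, seed=0):
    """Draw n rows (w, a, y); exposure probability is 0.3 + 0.4 * w."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < 0.3 + 0.4 * w else 0
        y = 1.0 * a + 2.0 * w + rng.gauss(0.0, 1.0)
        rows.append((w, a, y))
    return rows


def fit_exposure_mechanism(rows):
    """Estimate P(A = 1 | W = w) by the exposed fraction within each stratum."""
    counts = defaultdict(lambda: [0, 0])  # w -> [stratum size, number exposed]
    for w, a, _ in rows:
        counts[w][0] += 1
        counts[w][1] += a
    return {w: exposed / size for w, (size, exposed) in counts.items()}


def shifted_probability(g1, delta):
    """Hypothetical stochastic intervention: raise P(A = 1 | W) by delta, capped at 1."""
    return min(1.0, g1 + delta)


def iptw_stochastic_mean(rows, g1_by_w, delta=0.2):
    """IPTW estimate of the mean outcome under the shifted exposure distribution.

    Assumes positivity: every estimated g1 lies strictly between 0 and 1.
    """
    total = 0.0
    for w, a, y in rows:
        g1 = g1_by_w[w]
        g1_star = shifted_probability(g1, delta)
        if a == 1:
            weight = g1_star / g1
        else:
            weight = (1.0 - g1_star) / (1.0 - g1)
        total += weight * y
    return total / len(rows)


if __name__ == "__main__":
    rows = simulate(20000, seed=2)
    g1_by_w = fit_exposure_mechanism(rows)
    print("Estimated mean outcome under the shifted exposure:",
          iptw_stochastic_mean(rows, g1_by_w))
```

Under this simulation the intervened exposure probabilities are 0.5 and 0.9, so the true mean outcome under the intervention is 0.7 + 2 * 0.5 = 1.7, which gives a reference value for checking the estimate.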

Critical Mediators of Coagulopathy After Trauma [Meeting Abstract]

Kutcher, M. E.; Diaz, I.; Redick, B. J.; Vilardi, R. F.; Nelson, M. F.; Hubbard, A.; Cohen, M. J.
ISI:000308398600059
ISSN: 0041-1132
CID: 5304722