Type of Document Dissertation Author Smith, Derek Kyle URN etd-12232016-090335 Title Empirical Bayes Methods for Everyday Statistical Problems Degree PhD Department Biostatistics Advisory Committee

Advisor Name Title William Dupont Committee Chair Jeffrey Blume Committee Member Robert Greevy Committee Member Sonya Sterba Committee Member Keywords

- empirical bayes
- shrinkage
- probabilistic calibration
- clinical trial design
- acute kidney injury
Date of Defense 2016-12-16 Availability unrestricted AbstractThis work develops an empirical Bayes approach to statistical difficulties that arise in real-world applications. Empirical Bayes methods use Bayesian machinery to obtain statistical estimates, but rather than having a prior distribution for model parameters that is assumed, the prior is estimated from the observed data. Misuse of these methods as though the resulting “posterior distributions” were true Bayes posteriors has lead to limited adoption, but careful application can result in improved point estimation in a wide variety of circumstances.The first problem solved via an empirical Bayes approach deals with surrogate outcome measures. Theory for using surrogate outcomes for inference in clinical trials has been developed over the last 30 years starting with the development of the Prentice criteria for surrogate outcomes in 1989. Theory for using surrogates outside of the clinical trials arena or to develop risk score models is lacking. In this work we propose criteria similar to the Prentice criteria for using surrogates to develop risk scores. We then identify a particular type of surrogate which violates the proposed criteria in a particular way, which we deem a partial surrogate. The behavior of partial surrogates is investigated through a series of simulation studies and an empirical Bayes weighting scheme is developed which alleviates their pathologic behavior. It is then hypothesized that a common clinical measure, change in perioperative serum creatinine level from baseline, is actually a partial surrogate. It is demonstrated that it displays the same sort of pathologic behaviors seen in the simulation study and that they are similarly rectified using the proposed method. The result is a more acurate predictive model for both short and long-term measure of kidney function.

The second problem solved deals with likelihood support intervals. Likelihood intervals are a way to quantify statistical uncertainty. Unlike other, more common methods for interval estimation, every value that is included in a support interval must be supported by the data at a specified level. Support intervals have not seen wide usage in practice due to a philosophic belief amongst many in the field that frequency-based or probabilistic inference is somehow stronger than inference based soley on the likelihood. In this work we develop a novel procedure based on the bootstrap for estimating the frequency characteristics of likelihood intervals. The resulting intervals have both the frequency properties of the set prized by frequentists as well as each individual member of the set attaining a specified support level. An R package, supportInt, was developed to calculate these intervals and published on the Comprehensive R Archive Network.

The third problem addressed deals with the design of clinical trials when the potential protocols for the intervention are highly variable. A meta-analysis is presented in which the difficulties this situation presents becomes apparent. The results of this analysis of randomized trials of perioperative beta-blockade as a potential intervention to prevent my- ocardial infarction in the surgical setting are completely dependent on the statistical model chosen. In particular, which elements of the trial protocol are pooled and which are al- lowed by the model to impact the estimate of treatment efficacy completely determine the inference drawn from the data. This problem occurs largely because the trials conducted on the intervention of interest are not richly variable in some aspects of protocol. In this section it is demonstrated that large single protocol designs that are frequently advocated for can be replaced by multi-arm protocols to more accurately assess the question of an intervention’s potential efficacy. Simulation studies are conducted that make use of a novel adaptive randomization scheme based on an empirically estimated likelihood function. A tool is made available in a Shiny app that allows for the conduct of further studies by the reader under a wide variety of conditions.

Files

Filename Size Approximate Download Time (Hours:Minutes:Seconds)

28.8 Modem 56K Modem ISDN (64 Kb) ISDN (128 Kb) Higher-speed Access DerekKSmith.pdf1.88 Mb 00:08:41 00:04:28 00:03:54 00:01:57 00:00:10

Browse All Available ETDs by
( Author |
Department )

If you have more questions or technical problems, please Contact LITS.