A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-12232016-090335

Type of Document Dissertation
Author Smith, Derek Kyle
URN etd-12232016-090335
Title Empirical Bayes Methods for Everyday Statistical Problems
Degree PhD
Department Biostatistics
Advisory Committee
Advisor Name      Title
William Dupont    Committee Chair
Jeffrey Blume     Committee Member
Robert Greevy     Committee Member
Sonya Sterba      Committee Member
Keywords
  • empirical bayes
  • shrinkage
  • probabilistic calibration
  • clinical trial design
  • acute kidney injury
Date of Defense 2016-12-16
Availability unrestricted
This work develops an empirical Bayes approach to statistical difficulties that arise in real-world applications. Empirical Bayes methods use Bayesian machinery to obtain statistical estimates, but rather than assuming a prior distribution for the model parameters, they estimate the prior from the observed data. Misuse of these methods as though the resulting “posterior distributions” were true Bayes posteriors has led to limited adoption, but careful application can result in improved point estimation in a wide variety of circumstances.
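As a minimal sketch of what "estimating the prior from the observed data" can look like, the following assumes a simple normal-normal model with a known noise variance. The model, the method-of-moments estimator, and the names `eb_shrink` and `noise_var` are illustrative assumptions and are not taken from the dissertation.

```python
import statistics


def eb_shrink(observations, noise_var):
    """Shrink noisy observations toward their grand mean.

    Assumed model (illustrative only):
        x_i ~ Normal(theta_i, noise_var),  theta_i ~ Normal(mu, tau2)
    The prior parameters mu and tau2 are not specified in advance; they are
    estimated from the observations by the method of moments.
    noise_var must be positive and at least two observations are required.
    """
    mu_hat = statistics.fmean(observations)
    # The marginal variance of x_i is tau2 + noise_var, so subtract the noise part.
    tau2_hat = max(0.0, statistics.variance(observations) - noise_var)
    # Weight given to each raw observation; the remainder goes to the grand mean.
    weight = tau2_hat / (tau2_hat + noise_var)
    return [mu_hat + weight * (x - mu_hat) for x in observations]
```

Calling `eb_shrink([0.0, 10.0], 1.0)` pulls each value slightly toward the mean of 5, while a large `noise_var` relative to the spread of the data collapses every estimate to the grand mean. The output is a set of point estimates only; consistent with the caution above, it should not be read as a full Bayes posterior.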

The first problem solved via an empirical Bayes approach deals with surrogate outcome measures. Theory for using surrogate outcomes for inference in clinical trials has been developed over the last 30 years, starting with the development of the Prentice criteria for surrogate outcomes in 1989. Theory for using surrogates outside of the clinical trials arena or to develop risk score models is lacking. In this work we propose criteria similar to the Prentice criteria for using surrogates to develop risk scores. We then identify a particular type of surrogate that violates the proposed criteria in a particular way, which we deem a partial surrogate. The behavior of partial surrogates is investigated through a series of simulation studies, and an empirical Bayes weighting scheme is developed that alleviates their pathologic behavior. It is then hypothesized that a common clinical measure, change in perioperative serum creatinine level from baseline, is actually a partial surrogate. This measure is shown to display the same sort of pathologic behaviors seen in the simulation studies, and these behaviors are similarly rectified using the proposed method. The result is a more accurate predictive model for both short- and long-term measures of kidney function.

The second problem solved deals with likelihood support intervals. Likelihood intervals are a way to quantify statistical uncertainty. Unlike other, more common methods for interval estimation, every value that is included in a support interval must be supported by the data at a specified level. Support intervals have not seen wide usage in practice due to a philosophical belief among many in the field that frequency-based or probabilistic inference is somehow stronger than inference based solely on the likelihood. In this work we develop a novel procedure based on the bootstrap for estimating the frequency characteristics of likelihood intervals. The resulting intervals have the frequency properties prized by frequentists, while each individual member of the set also attains a specified support level. An R package, supportInt, was developed to calculate these intervals and published on the Comprehensive R Archive Network.
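To make the notion of a support interval concrete, here is a small sketch for a binomial proportion: the 1/k support interval is the set of values whose likelihood is at least 1/k times the maximum likelihood. This is a plain grid search written for illustration. It does not reproduce the bootstrap calibration procedure described above, it does not use or mirror the supportInt package, and the function names, the grid size, and the choice k = 8 are all assumptions.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_log_lik(p, successes, trials):
    """Binomial log-likelihood, dropping the constant binomial coefficient."""
    return _xlogy(successes, p) + _xlogy(trials - successes, 1.0 - p)


def support_interval(successes, trials, k=8.0, grid_size=10000):
    """Approximate the 1/k likelihood support interval for a proportion.

    A value p is kept when L(p) / L(p_hat) >= 1/k, where p_hat is the
    maximum likelihood estimate. The binomial likelihood is unimodal in p,
    so the kept values form a single interval. Endpoints are accurate only
    to the grid spacing, and very narrow intervals need a finer grid.
    """
    p_hat = successes / trials
    cutoff = binomial_log_lik(p_hat, successes, trials) - math.log(k)
    # Grid strictly inside (0, 1) so that log() is always defined.
    grid = (i / (grid_size + 1) for i in range(1, grid_size + 1))
    kept = [p for p in grid if binomial_log_lik(p, successes, trials) >= cutoff]
    return min(kept), max(kept)
```

For example, `support_interval(5, 10)` returns an interval centred on 0.5, and raising `k` widens it because weaker relative support is then enough for a value to be included.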

The third problem addressed deals with the design of clinical trials when the potential protocols for the intervention are highly variable. A meta-analysis is presented in which the difficulties this situation presents become apparent. The results of this analysis of randomized trials of perioperative beta-blockade as a potential intervention to prevent myocardial infarction in the surgical setting are completely dependent on the statistical model chosen. In particular, which elements of the trial protocol are pooled and which are allowed by the model to impact the estimate of treatment efficacy completely determine the inference drawn from the data. This problem occurs largely because the trials conducted on the intervention of interest are not richly variable in some aspects of protocol. In this section it is demonstrated that the large single-protocol designs that are frequently advocated can be replaced by multi-arm protocols to more accurately assess the question of an intervention’s potential efficacy. Simulation studies are conducted that make use of a novel adaptive randomization scheme based on an empirically estimated likelihood function. A tool is made available in a Shiny app that allows the reader to conduct further studies under a wide variety of conditions.

Filename          Size      Approximate Download Time (Hours:Minutes:Seconds)
                            28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
DerekKSmith.pdf   1.88 Mb   00:08:41     00:04:28    00:03:54       00:01:57        00:00:10
