TITLE: Statistical Estimation and Changepoint Detection Methods in Public Health Surveillance
ABSTRACT:
This thesis focuses on assessing and improving statistical methods implemented in two areas of public health research. The first topic involves estimation of national influenza-associated mortality rates via mathematical modeling. The second topic involves the timely detection of infectious disease outbreaks using statistical process control monitoring.
For over fifty years, the Centers for Disease Control and Prevention has been estimating annual rates of U.S. deaths attributable to influenza. These estimates have been used to determine costs and benefits associated with influenza prevention and control strategies. Quantifying the effect of influenza on mortality, however, can be challenging since influenza infections typically are neither confirmed virologically nor specified on death certificates. Consequently, a wide range of ecologically based mathematical modeling approaches has been applied to specify the association between influenza and mortality. To date, all influenza-associated death estimates have been based on mortality data first aggregated at the national level and then modeled. However, a number of local-level seasonal factors may confound the association between influenza and mortality, which suggests that data should be modeled at the local level and then pooled to produce national estimates of death.
The first component of the thesis topic involving mortality estimation addresses this issue by introducing and implementing a two-stage hierarchical Bayesian modeling approach. In the first stage, city-level data with varying trends in mortality and weather were modeled using semi-parametric generalized additive models. In the second stage, the log-relative risk estimates calculated for each city in stage 1 served as the “outcome” variable and were modeled in two ways: (1) assuming spatial independence across cities using a Bayesian generalized linear model, and (2) assuming correlation among cities using a Bayesian spatial correlation model. Results from these models were compared to those from a more conventional approach.
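The pooling idea behind the second stage can be illustrated with a deliberately simplified sketch. It assumes the spatial-independence case only, a normal model in which each city's estimated log-relative risk has a known sampling variance, a flat prior on the national mean, and a between-city variance `tau2` that is treated as fixed. None of these choices, nor the function name `pool_city_estimates`, comes from the thesis, whose priors and fitting procedure are not described here.

```python
from typing import Sequence, Tuple


def pool_city_estimates(
    log_rr: Sequence[float],
    variances: Sequence[float],
    tau2: float,
) -> Tuple[float, float]:
    """Pool city-level log-relative risks into one national value.

    Assumed model (illustrative, not the thesis's specification):
        estimate_i ~ Normal(beta_i, variances[i])   # stage 1 output
        beta_i     ~ Normal(mu, tau2)               # stage 2, independent cities
    With a flat prior on mu and tau2 held fixed, the posterior of mu is
    normal with a precision-weighted mean and variance 1 / sum(weights).
    """
    if not log_rr or len(log_rr) != len(variances):
        raise ValueError("log_rr and variances must be non-empty and equal length")
    # Each city's weight is the inverse of its total (sampling + between-city) variance.
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    mean = sum(w * b for w, b in zip(weights, log_rr)) / total
    return mean, 1.0 / total
```

Cities with noisier stage 1 estimates receive less weight, and a larger `tau2` pulls the weights toward equality. A spatially correlated second stage would replace the independent weights with a full covariance matrix across cities.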
The second component of this topic examines the extent to which seasonal confounding and collinearity affect the relationship between influenza and mortality at the local (city) level. Disentangling the effects of temperature, humidity, and other seasonal confounders on the association between influenza and mortality is challenging since these covariates are often temporally collinear with influenza activity. Three modeling strategies with varying representations of background seasonality were compared. Seasonal covariates entered into the model may represent measured factors (e.g., ambient temperature) or stand in for unmeasured ones (e.g., time-based smoothing splines or Fourier terms). An advantage of modeling background seasonality via time splines is that the amount of seasonal curvature can be controlled by the number of degrees of freedom specified for the spline. The estimated effects of influenza activity on mortality are compared across these representations of seasonal confounding.
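As an illustration of the Fourier-term representation of background seasonality, the sketch below builds sine and cosine covariates for a given time index. The function name `fourier_terms`, the 52-week period, and the choice of two harmonics are assumptions made for the example and are not taken from the thesis.

```python
import math
from typing import List


def fourier_terms(t: float, period: float, n_harmonics: int) -> List[float]:
    """Return [sin_1, cos_1, ..., sin_J, cos_J] evaluated at time t.

    Harmonic j completes j full cycles per `period`, so higher harmonics
    let the fitted seasonal curve take sharper, less sinusoidal shapes.
    """
    terms: List[float] = []
    for j in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * j * t / period
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms


# Hypothetical design-matrix rows for weekly data with annual seasonality:
# one row of four seasonal covariates per week.
seasonal_rows = [fourier_terms(week, 52.0, 2) for week in range(104)]
```

The number of harmonics plays a role loosely similar to the degrees of freedom of a time spline: more terms allow more seasonal curvature, at the cost of greater collinearity with influenza activity.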
The third component of this topic explores the relationship between mortality rates and influenza activity using a flexible, natural cubic spline function to model the influenza term. The conventional approach of fitting influenza-activity terms linearly in regression was found to be too constraining. Results show that the association is best represented nonlinearly.
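For readers unfamiliar with natural cubic splines, the following sketch evaluates one common basis for them, the truncated-power form, at a single value of the influenza-activity covariate. The knot locations and the function name `natural_spline_basis` are hypothetical, and the thesis may have used a different but equivalent basis or software routine.

```python
from typing import List, Sequence


def _pos_cube(u: float) -> float:
    """Truncated cubic: u**3 when u is positive, otherwise 0."""
    return u ** 3 if u > 0 else 0.0


def natural_spline_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Evaluate a natural cubic spline basis with K knots at x (K columns).

    `knots` must be strictly increasing. The fitted curve is cubic between
    knots and constrained to be linear beyond the first and last knots.
    """
    n_knots = len(knots)
    if n_knots < 3:
        raise ValueError("at least three knots are required")
    last = knots[-1]

    def d(k: int) -> float:
        return (_pos_cube(x - knots[k]) - _pos_cube(x - last)) / (last - knots[k])

    d_penultimate = d(n_knots - 2)
    # Intercept, linear term, then K - 2 nonlinear terms.
    return [1.0, x] + [d(k) - d_penultimate for k in range(n_knots - 2)]
```

Regressing mortality on these columns instead of on a single linear influenza term lets the estimated association bend between knots, which is the kind of flexibility the linear specification lacks.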
The second area of focus in this thesis involves infectious disease outbreak detection. A fundamental goal of public health surveillance, particularly syndromic surveillance, is the timely detection of increases in the rate of unusual events. In syndromic surveillance, a significant increase in the incidence of monitored disease outcomes would trigger an alert, possibly prompting the implementation of an intervention strategy. Public health surveillance generally monitors count data (e.g., counts of influenza-like illness, sales of over-the-counter remedies, and number of visits to outpatient clinics). Statistical process control charts, designed for quality control monitoring in industry, have been widely adapted for use in disease and syndromic surveillance. The behavior of these detection methods on discrete distributions, however, has not been explored in detail.
For this component of the thesis, a simulation study was conducted to compare the CuSum and EWMA methods for detection of increases in negative binomial rates with varying amounts of dispersion. The goal of each method is to detect an increase in the mean number of cases as soon as possible after an upward rate shift has occurred. The performance of the CuSum and EWMA detection methods was evaluated using the conditional expected delay criterion, a measure of the detection delay, i.e., the time between the occurrence of a shift and its detection. Detection capabilities were explored under varying shift sizes and shift times.
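The two detection rules and the delay measure can be sketched in a few lines. This is a minimal illustration using the textbook one-sided forms of each chart. The reference value `k`, the thresholds `h`, the smoothing weight `lam`, the in-control mean `mu0`, and the example counts are all assumed values chosen for the demonstration, not settings or data from the simulation study.

```python
from typing import Optional, Sequence


def cusum_alarm(counts: Sequence[float], k: float, h: float) -> Optional[int]:
    """0-based index of the first upper CuSum alarm, or None if no alarm.

    Recursion: S_t = max(0, S_{t-1} + x_t - k), alarm when S_t >= h.
    """
    s = 0.0
    for t, x in enumerate(counts):
        s = max(0.0, s + x - k)
        if s >= h:
            return t
    return None


def ewma_alarm(
    counts: Sequence[float], mu0: float, lam: float, h: float
) -> Optional[int]:
    """0-based index of the first upper EWMA alarm, or None if no alarm.

    Recursion: Z_t = lam * x_t + (1 - lam) * Z_{t-1}, started at Z_0 = mu0.
    """
    z = mu0
    for t, x in enumerate(counts):
        z = lam * x + (1.0 - lam) * z
        if z >= h:
            return t
    return None


def detection_delay(alarm: Optional[int], shift: int) -> Optional[int]:
    """Delay for one run, or None when the run had no alarm at or after the shift.

    Runs returning None (false alarms before the shift, or no alarm) are
    excluded, which is the conditioning in a conditional expected delay.
    """
    if alarm is None or alarm < shift:
        return None
    return alarm - shift


# Hypothetical weekly counts whose mean rises at index 5.
counts = [2, 3, 2, 3, 2, 6, 7, 6, 7, 6]
cusum_delay = detection_delay(cusum_alarm(counts, k=4.0, h=5.0), shift=5)
ewma_delay = detection_delay(ewma_alarm(counts, mu0=2.5, lam=0.3, h=4.5), shift=5)
```

In a simulation study, the counts would instead be drawn repeatedly from negative binomial distributions before and after the shift, and the conditional expected delay would be estimated by averaging the non-None delays across runs. Some authors count the delay as the alarm time minus the shift time plus one; the convention here is an assumption.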