A Hierarchical Modeling Details


Technical Supplement for “Disparities in Temporal and Geographic Patterns of Declining Heart Disease Mortality by Race and Sex in the United States, 1973–2010” by Vaughan et al.

Here we describe how we model heart disease mortality rates while accounting for spatial, temporal, and between-race/sex correlations.

We let $Y_{ikt}$ be the heart disease mortality rate (per 100,000) for county $i$, race/sex group $k$, at time $t$, and we assume

$$Y_{ikt} \sim N\left(x_{ikt}'\beta_k + Z_{ikt},\ \tau_{ikt}^2\right), \qquad (1)$$

where $x_{ikt}$ denotes a vector of covariates with corresponding regression coefficients $\beta_k$, and $\tau_{ikt}^2 = \tau_k^2 / n_{ikt}$.

We model the mortality rate using a normal approximation to the binomial distribution. For instance, suppose the number of deaths, $Y^*$, is binomially distributed with population size $n$ and event rate $p$. For large $n$, a reasonable approximation is

$$Y^* \sim N\left(np,\ np(1-p)\right).$$

From this, it follows that the observed mortality rate, $Y = Y^*/n$, can be modeled as

$$Y \sim N\left(p,\ \frac{p(1-p)}{n}\right).$$

This equation serves as the basis for our choice of $\tau_{ikt}^2$: the variance is inversely proportional to the population size, so rates for counties with larger populations are more precise (i.e., have smaller variance) than those for counties with smaller populations, as expected.
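As a quick numerical check of this variance, the following sketch simulates binomial death counts for a small and a large county and compares the empirical variance of the rate with $p(1-p)/n$. The event rate, population sizes, number of draws, and the helper name `simulate_rate_variance` are illustrative assumptions and are not taken from the analysis.

```python
import numpy as np

def simulate_rate_variance(n, p, n_draws=200_000, seed=0):
    """Empirical mean and variance of the rate Y = Y*/n, with Y* ~ Binomial(n, p)."""
    rng = np.random.default_rng(seed)
    deaths = rng.binomial(n, p, size=n_draws)  # simulated death counts Y*
    rates = deaths / n                         # observed rates Y = Y*/n
    return rates.mean(), rates.var()

p = 0.003  # hypothetical event rate (300 deaths per 100,000)
results = {}
for n in (5_000, 500_000):  # hypothetical small and large county populations
    mean, var = simulate_rate_variance(n, p)
    theory = p * (1 - p) / n  # variance of the rate under the approximation
    results[n] = (mean, var, theory)
    print(f"n={n:>7}: mean={mean:.5f}, var={var:.3e}, p(1-p)/n={theory:.3e}")
```

The simulated variance of the rate should track $p(1-p)/n$ closely for both populations, with the larger county showing a variance roughly one hundred times smaller.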

In our analysis, $x_{ikt}'\beta_k = \beta_{kt}$; i.e., we have an intercept term for each race/sex group at each time. In general, this model can easily accommodate other covariate information.
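To make the intercept-only specification concrete, the sketch below builds an indicator design matrix for a single race/sex group so that each row's linear predictor picks out the intercept for its time interval. The dimensions, row ordering, and coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

n_counties, n_times = 3, 4  # hypothetical sizes, for illustration only

# Rows are ordered county by county, with time varying fastest within a county.
# Each row is an indicator vector selecting that row's time interval.
X = np.kron(np.ones((n_counties, 1)), np.eye(n_times))

beta_k = np.array([320.0, 300.0, 270.0, 240.0])  # hypothetical intercepts beta_kt

linear_predictor = X @ beta_k  # x_ikt' beta_k for every (county, time) pair
print(linear_predictor.reshape(n_counties, n_times))
```

Every county receives the same time-specific intercept for this group; additional covariates would enter as extra columns of `X`.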

$Z_{ikt}$ denotes a spatiotemporal random effect that also allows for correlation between the race/sex groups.

o Spatial association is imposed using the commonly used conditional autoregressive (CAR) model introduced by Besag [1] and extended by Besag et al. [2]. Because ours is a multivariate spatial model (i.e., we have a different map for each group and each time), we use the multivariate CAR (MCAR) model of Gelfand and Vounatsou [3]. This allows each county to leverage the information provided by its neighboring counties, inducing spatial smoothness in our estimated mortality rates; this smoothing is particularly beneficial for counties with small populations, where borrowing strength results in increased precision.

o Temporal association is imposed by assuming an autoregressive model of order 1 (AR(1)) with parameter $\rho$. In our MCAR model, this temporal correlation structure is denoted by the $N_t \times N_t$ matrix $R(\rho)$. This allows our estimates at time $t$ to be informed by the estimates at both times $t-1$ and $t+1$, thereby smoothing each county’s rates in time interval $t$ toward those from the preceding and succeeding time intervals.

o The race/sex covariance structure is modeled using an unstructured covariance matrix. This general structure lets the data inform the model about whether any two race/sex groups are correlated. In our MCAR model, this covariance structure is denoted by the $N_g \times N_g$ matrix $G$.
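The temporal building block above can be sketched numerically. The example below assumes the standard AR(1) correlation form, with entries $\rho^{|s-t|}$; the supplement does not spell out $R(\rho)$ entry by entry, so this form, the function name `ar1_correlation`, and the chosen values are assumptions made for illustration.

```python
import numpy as np

def ar1_correlation(n_times, rho):
    """Return the n_times x n_times matrix with entries rho**|s - t|."""
    t = np.arange(n_times)
    return rho ** np.abs(t[:, None] - t[None, :])

n_times, rho = 6, 0.9  # illustrative values, not taken from the analysis
R = ar1_correlation(n_times, rho)
Q = np.linalg.inv(R)   # precision matrix implied by R(rho)

# Largest precision entry linking time intervals more than one step apart.
t = np.arange(n_times)
far_apart = np.abs(t[:, None] - t[None, :]) > 1
max_far_entry = np.abs(Q[far_apart]).max()

print(np.round(R[0], 3))  # correlations between the first interval and every interval
print(max_far_entry)      # zero up to floating-point rounding
```

Under this assumed form the precision matrix is tridiagonal, which is the matrix counterpart of each time interval being informed directly only by its immediate predecessor and successor.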

Putting these pieces together, we assume the $N_s N_t N_g$-vector $Z$ is distributed

$$Z \sim \mathrm{MCAR}\left(R(\rho) \otimes G\right),$$

where $\otimes$ denotes the Kronecker product. Because we “separate” the temporal and race/sex structures in our model, this type of model is referred to as a separable model; see Quick et al. [4] for a more general nonseparable approach.

To complete our model specification, we need to assign prior distributions to our model parameters. Here, we let $\pi(a)$ denote the density function for the random variable $a$.

o We assign a noninformative flat prior to $\beta$: $\pi(\beta) \propto 1$.

o Following Gelman [5], we assign a noninformative flat prior to $\tau_k$: $\pi(\tau_k) \propto 1$.

o We assign $G$ an inverse Wishart prior with scale matrix $G_0 = 0.001 \times I_{N_g}$, where $I_{N_g}$ is the $N_g$-dimensional identity matrix, and $\nu = N_g$ degrees of freedom (which must be greater than $N_g - 1$). While this choice of scale matrix is somewhat arbitrary, choosing a small number of degrees of freedom makes this prior quite noninformative, particularly given that $G$ will be informed by $N_t N_s = 58{,}881$ collections of race/sex data.

o Finally, we assign a Beta(9,1) prior to $\rho$. While this prior encourages larger values of $\rho$ (i.e., stronger correlations; $E[\rho] = 0.9$ a priori), here again we have plenty of data from which to estimate this parameter ($N_g N_s = 12{,}396$ time trends).

Putting these pieces together, our full hierarchical model is as follows:

$$\pi\left(\beta, Z, G, \rho, \tau^2 \mid Y\right) \propto N\left(Y \mid X\beta + Z,\ \Sigma_Y\right) \times \mathrm{MCAR}\left(Z \mid R(\rho) \otimes G\right) \times \mathrm{InvWish}\left(G \mid \nu, G_0\right) \times \mathrm{Beta}\left(\rho \mid 9, 1\right) \times \prod_k \pi\left(\tau_k^2\right), \qquad (2)$$

where $\Sigma_Y$ is a diagonal matrix with elements $\tau_{ikt}^2$, $X$ is the $(N_s N_g N_t \times p)$ matrix of covariates, and $\pi(\tau_k^2)$ is the density for $\tau_k^2$ that corresponds to a flat prior for $\tau_k$ (equivalent to an improper IG(−1/2, 0)).

Upon fitting this model using Markov chain Monte Carlo (MCMC; see Section 3.4 of Carlin and Louis [6] and the references therein), we obtain samples from the posterior distribution of the mean mortality rate, $x_{ikt}'\beta_k + Z_{ikt}$, which we use to generate our results, including our percent declines. An R package to implement this model is currently under development, but code is available upon request; send inquiries to [email protected].

References

[1] J. Besag. Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36:192–236, 1974.
[2] J. Besag, J. York, and A. Mollié. Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43:1–59, 1991.
[3] A. E. Gelfand and P. Vounatsou. Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics, 4:11–25, 2003.
[4] H. Quick, L. A. Waller, and M. Casper. A nonseparable multivariate space-time model for analyzing county-level heart disease death rates for race and gender. arXiv preprint, arXiv:1507.02741, 2015.
[5] A. Gelman. Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1:515–533, 2006.
[6] B. P. Carlin and T. A. Louis. Bayesian Methods for Data Analysis. Chapman & Hall/CRC, 2008.
