BDA3 Chapter 3 Exercise 2

Here’s my solution to exercise 2, chapter 3, of Gelman’s Bayesian Data Analysis (BDA), 3rd edition. There are solutions to some of the exercises on the book’s webpage.

\(\DeclareMathOperator{\dbinomial}{Binomial} \DeclareMathOperator{\dbern}{Bernoulli} \DeclareMathOperator{\dpois}{Poisson} \DeclareMathOperator{\dnorm}{Normal} \DeclareMathOperator{\dcauchy}{Cauchy} \DeclareMathOperator{\dexponential}{Exp} \DeclareMathOperator{\dgamma}{Gamma} \DeclareMathOperator{\dinvgamma}{InvGamma} \DeclareMathOperator{\invlogit}{InvLogit} \DeclareMathOperator{\logit}{Logit} \DeclareMathOperator{\ddirichlet}{Dirichlet} \DeclareMathOperator{\dbeta}{Beta}\)

survey bush dukakis other total
pre-debate 294 307 38 639
post-debate 288 332 19 639

Let \(\theta_\text{pre} \sim \ddirichlet_3(\alpha_\text{pre})\) and \(\theta_\text{post} \sim \ddirichlet_3(\alpha_\text{post})\) be the pre- and post-debate priors. Then the posteriors are

\[ \begin{align} \theta_\text{pre} \mid y &\sim \ddirichlet\left(294 + \alpha_{(\text{pre}, \text{b})}, 307 + \alpha_{(\text{pre}, \text{d})}, 38 + \alpha_{(\text{pre}, \text{o})} \right) \\ \theta_\text{post} \mid y &\sim \ddirichlet\left(288 + \alpha_{(\text{post}, \text{b})}, 332 + \alpha_{(\text{post}, \text{d})}, 19 + \alpha_{(\text{post}, \text{o})} \right) . \end{align} \]

Denote the proportion of support for Bush amongst Bush/Dukakis supporters by \(\phi_\bullet := \frac{\theta_{(\bullet, \text{b})}}{\theta_{(\bullet, \text{b})} + \theta_{(\bullet, \text{d})}}\). The results of the previous exercise show that we can treat this as a beta-binomial problem by simply ignoring the results for other. More precisely,

\[ \begin{align} \phi_\text{pre} \mid y &\sim \dbeta\left(294 + \alpha_{(\text{pre}, \text{b})}, 307 + \alpha_{(\text{pre}, \text{d})} \right) \\ \phi_\text{post} \mid y &\sim \dbeta\left(288 + \alpha_{(\text{post}, \text{b})}, 332 + \alpha_{(\text{post}, \text{d})} \right) \end{align} . \]

Let’s plot these posteriors and their difference \(\delta := (\phi_\text{post} \mid y) - (\phi_\text{pre} \mid y)\).

alpha_bush <- 1
alpha_dukakis <- 1

posteriors <- expand.grid(
    value = seq(0, 1, 0.001),
    survey = df$survey
  ) %>% 
  as_tibble() %>% 
  inner_join(df, by = 'survey') %>% 
  mutate(
    post_bush = bush + alpha_bush,
    post_dukakis = dukakis + alpha_dukakis,
    density = dbeta(value, post_bush, post_dukakis)
  ) 
N <- 10000

shift <- expand.grid(
    draw = 1:N, 
    survey = df$survey
  ) %>% 
  as_tibble() %>% 
  inner_join(df, by = 'survey') %>% 
  mutate(
    post_bush = bush + alpha_bush,
    post_dukakis = dukakis + alpha_dukakis,
    value = rbeta(n(), post_bush, post_dukakis)
  ) %>% 
  select(draw, survey, value) %>% 
  spread(survey, value) %>% 
  mutate(difference = `post-debate` - `pre-debate`) 

There is a 19% probability (the area above 0) that there was a positive shift towards Bush.