BDA3 Chapter 5 Exercise 13

Here’s my solution to exercise 13, chapter 5, of Gelman’s Bayesian Data Analysis (BDA), 3rd edition. There are solutions to some of the exercises on the book’s webpage.

\(\DeclareMathOperator{\dbinomial}{Binomial} \DeclareMathOperator{\dbern}{Bernoulli} \DeclareMathOperator{\dpois}{Poisson} \DeclareMathOperator{\dnorm}{Normal} \DeclareMathOperator{\dt}{t} \DeclareMathOperator{\dcauchy}{Cauchy} \DeclareMathOperator{\dexponential}{Exp} \DeclareMathOperator{\duniform}{Uniform} \DeclareMathOperator{\dgamma}{Gamma} \DeclareMathOperator{\dinvgamma}{InvGamma} \DeclareMathOperator{\invlogit}{InvLogit} \DeclareMathOperator{\logit}{Logit} \DeclareMathOperator{\ddirichlet}{Dirichlet} \DeclareMathOperator{\dbeta}{Beta}\)

df <- read_csv('data/chapter_03_exercise_08.csv') %>% 
  filter(type == 'residential' & bike_route) %>% 
  transmute(
    i = 1:n(),
    bikes,
    other,
    total = bikes + other,
    rate = bikes / total
  )
Subset of table 3.3 restricted to residential streets that are bike routes
i bikes other total rate
1 16 58 74 0.2162162
2 9 90 99 0.0909091
3 10 48 58 0.1724138
4 13 57 70 0.1857143
5 19 103 122 0.1557377
6 20 57 77 0.2597403
7 18 86 104 0.1730769
8 17 112 129 0.1317829
9 35 273 308 0.1136364
10 55 64 119 0.4621849

We’ll use the prior on \(\alpha\), \(\beta\) given in equation 5.9 and implement it in Stan. Note that stan works on the log scale, so we increment the posterior density (= target) by \(\log\left( (\alpha + \beta)^{-\frac{5}{2}} \right) = -\frac{5}{2}\log(\alpha + \beta)\). Here is the model code:

model <- rstan::stan_model('src/ex_05_13.stan')
S4 class stanmodel 'ex_05_13' coded as follows:
data {
  int<lower = 1> n;
  int<lower = 0> total[n];
  int<lower = 0> bikes[n];
}

parameters {
  real<lower = 0> alpha;
  real<lower = 0> beta;
  vector<lower = 0, upper = 1>[n] theta;
}

model {
  // joint prior on alpha, beta
  target += -(5. / 2.) * log(alpha + beta); 
  // theta prior
  theta ~ beta(alpha, beta); 
  // likelihood
  bikes ~ binomial(total, theta); 
} 

Now we calculate the posterior, using tidybayes to take care of passing the data to stan.

fit <- model %>% 
  rstan::sampling(data = tidybayes::compose_data(df)) %>% 
  tidybayes::recover_types(df)

We don’t show any diagnostics here, but there are no divergences, the rhat is very close to 1, the effective sample size is large, and the traces look reasonable.

Now we can draw some samples from the posterior.

draws <- fit %>% 
  tidybayes::spread_draws(alpha, beta, theta[i]) 
Contour plot of the marginal posterior density of (log(α/β), log(α + β)) for the residential bike route example.
Contour plot of the marginal posterior density of (log(α/β), log(α + β)) for the residential bike route example.

We can also use our posterior draws to plot the observed (= unpooled) rates against the estimated rates. In this case, there are no large differences, although smaller rates seem to be estimated slightly higher than observed and larger rates lower than observed. All observed rates are within the posterior intervals of the estimates.

Posterior medians and 95% intervals of bike observation rates based on simulations from the join posterior distribution.
Posterior medians and 95% intervals of bike observation rates based on simulations from the join posterior distribution.

To calculate a posterior interval for the average underlying proportion of bike traffic, we sample \(\alpha, \beta\) from the posterior, then draw a new \(\tilde\theta \sim \dbeta(\alpha, \beta)\). It wouldn’t be correct to use the values of \(\theta_j\) from the model parameters, since those are estimates from known streets. If we also want an estimate of the number of bikes observed on a new street (where 100 total vehicles go by), then we draw \(\tilde y \sim \dbinomial(100, \tilde\theta)\) . The posterior intervals are then just the desired quantiles of the drawn parameters.

cis <- draws %>% 
  filter(i == 1) %>% 
  mutate(
    theta = rbeta(n(), alpha, beta),
    y = rbinom(n(), 100, theta)
  ) %>% 
  tidybayes::median_qi() %>% 
  select(matches('y|theta')) 

The 95% posterior interval for \(\tilde\theta\) is (3.65%, 49.1%), which includes 10 of the 10 observed rates. The 95% posterior interval of \(\tilde y\) is (3, 49.025), which includes 9 of the 10 observed values. These intervals seem reasonable, although they are fairly wide. It’s difficult to make a statement about how useful these could be in application without a concrete idea of what that application is.