# Wasserman's AoS, Chapter 7, Question 5

We show that the covariance $$\text{Cov} (\hat F_n(x), \hat F_n(y))$$ is $\frac{1}{n} F(x) \left( 1 + F(y) \right) ,$ for $$x \le y$$. For $$x \ge y$$, we have $$\frac{1}{n} F(y) \left( 1 + F(x) \right)$$.

Using linearity of expectation and the fact that $$\mathbb E \hat F_n(y) = F(y)$$, we obtain

$\mathbb E (\hat F_n(x) - F(x)) (\hat F_n(y) - F(y)) = \mathbb E (\hat F_n(x)\hat F_n(y)) - F(x)F(y) .$

By definition of the empirical cumulative distribution function, the first term is equal to

$\frac{1}{n^2} \sum_{i, j} \mathbb E (I(X_i \le x)I(X_j \le y)) = \frac{1}{n^2} \left( \sum_{i = j} \mathbb E (I(X_i \le x)I(X_i \le y)) + \sum_{i \ne j} \mathbb E (I(X_i \le x)I(X_j \le y)) \right) .$

We now simplify the two summations:

1. Without loss of generality, we may assume that $$x \le y$$. With this assumption, $$I(X_i \le x)I(X_i \le y) = I(X_i \le x)$$.

2. Note that the Bernoulli variables $$I(X_i \le x)$$ and $$I(X_j \le y)$$ are independent for $$i \ne j$$ since $$X_i$$ and $$X_j$$ are independent. This implies that the expectation of the product is the product of the expectations.

These two facts allow us to rewrite the first term as

\begin{align} \frac{1}{n^2} \left( \sum_{i = j} \mathbb E (I(X_i \le x)) + \sum_{i \ne j} \mathbb E (I(X_i \le x)) \mathbb E (I(X_j \le y)) \right) &= \frac{1}{n^2} \left( \sum_{i = j} F(x) + \sum_{i \ne j} F(x)F(y) \right) \\ &= \frac{1}{n} F(x) ( 1 + (n-1) F(y) ) \end{align} .

Putting everything together, we see that the covariance is

\begin{align} \frac{1}{n} F(x) (1 + (n-1) F(y)) - F(x)F(y) &= \frac{1}{n} F(x) (1 - F(y)) \end{align}

for $$x \le y$$.