19
$\begingroup$

Is the variance of the mean of a set of possibly dependent random variables less than or equal to the average of their respective variances?

Mathematically, given random variables $X_1, X_2, \ldots, X_n$ that may be dependent:

Let $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ be the mean of these random variables.

Is it true that:

$$\text{Var}(\bar{X}) \leq \frac{1}{n}\sum_{i=1}^n \text{Var}(X_i)$$

I know that for independent random variables, we have the following equality:

$$\text{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^n \text{Var}(X_i)$$

which clearly satisfies the inequality. However, I'm unsure whether the inequality still holds for dependent variables.

If this inequality is true, is there a proof or intuitive explanation?

If it's not always true, are there conditions under which it holds? What about the following inequality? $$\text{Var}(\bar{X}) \leq \max_{1 \le i \le n} \text{Var}(X_i)$$

Any insights, proofs, or counterexamples would be greatly appreciated. Thank you!

$\endgroup$

4 Answers

21
$\begingroup$

Yes, it is true. Here is a proof. $$ \begin{align} \newcommand{\Var}{\operatorname{Var}} &\Var(\overline{X}) \\ &= \frac1{n^2}\Var\left(\sum_{i=1}^n X_i\right) \\ &=\frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\operatorname{Cov}(X_i,X_j) \\ &\le\frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\sqrt{\Var(X_i)\cdot \Var(X_j)} \\ &\le\frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\frac{\Var(X_i)+\Var(X_j)}{2} \\ &= \frac1n\sum_{i=1}^n \Var(X_i). \end{align} $$ The first inequality is Cauchy–Schwarz applied to each covariance, and the second is the AM-GM inequality $\sqrt{ab} \le \frac{a+b}{2}$.
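
If you want to convince yourself numerically, here is a minimal sketch in pure Python. It is not part of the proof: it just draws random covariance matrixes of the form $AA^{\top}$ (an assumption made so that they are automatically positive semidefinite) and compares $\frac{1}{n^2}\sum_{i,j}\operatorname{Cov}(X_i,X_j)$ with $\frac1n\sum_i \operatorname{Var}(X_i)$. The helper names are made up for illustration.

```python
import random


def random_covariance(n, rng):
    """Return a random n x n covariance matrix built as A @ A^T (positive semidefinite)."""
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def var_of_mean(cov):
    """Var(X-bar) = (1/n^2) * sum over all i, j of Cov(X_i, X_j)."""
    n = len(cov)
    return sum(sum(row) for row in cov) / n**2


def mean_of_vars(cov):
    """(1/n) * sum of Var(X_i), read off the diagonal."""
    n = len(cov)
    return sum(cov[i][i] for i in range(n)) / n


def check_inequality(trials=1000, seed=0):
    """Return True if Var(X-bar) <= average variance in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 8)
        cov = random_covariance(n, rng)
        # Small tolerance to absorb floating-point rounding.
        if var_of_mean(cov) > mean_of_vars(cov) + 1e-12:
            return False
    return True


print(check_inequality())
```

Since the average of the variances never exceeds their maximum, the same check also covers the $\max_i \operatorname{Var}(X_i)$ bound asked about in the question.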

$\endgroup$
13
$\begingroup$

In general, one has: $$ \begin{align} \operatorname{Var}\left(\sum_{k=1}^n X_k\right) &= \sum_{i,j=1}^n \operatorname{Cov}(X_i,X_j) \end{align} $$ Now, the well-known inequality $ab \le \frac{1}{2}(a^2+b^2)$, applied pointwise inside the expectation, lets us write: $$ \begin{align} \operatorname{Cov}(X,Y) &= \Bbb{E}\left[(X-\Bbb{E}[X])(Y-\Bbb{E}[Y])\right] \\ &\le \frac{1}{2}\Bbb{E}\left[(X-\Bbb{E}[X])^2 + (Y-\Bbb{E}[Y])^2\right] \\ &= \frac{1}{2} \left(\operatorname{Var}(X) + \operatorname{Var}(Y)\right) \end{align} $$ Hence $$ \operatorname{Var}\left(\sum_{k=1}^n X_k\right) \le \frac{1}{2} \sum_{i,j=1}^n \left(\operatorname{Var}(X_i) + \operatorname{Var}(X_j)\right) = n \sum_{k=1}^n \operatorname{Var}(X_k) $$ and finally $$ \operatorname{Var}\left(\bar{X}\right) = \frac{1}{n^2} \operatorname{Var}\left(\sum_{k=1}^n X_k\right) \le \frac{1}{n} \sum_{k=1}^n \operatorname{Var}(X_k) $$
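
To see the argument in the smallest nontrivial case, take two variables $X$ and $Y$ (the names are just for illustration). The covariance bound gives $$ \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y) \le 2\left(\operatorname{Var}(X) + \operatorname{Var}(Y)\right), $$ so dividing by $4$, $$ \operatorname{Var}\left(\frac{X+Y}{2}\right) \le \frac{\operatorname{Var}(X) + \operatorname{Var}(Y)}{2}. $$ The gap can even be written down exactly in this case: $$ \frac{\operatorname{Var}(X) + \operatorname{Var}(Y)}{2} - \operatorname{Var}\left(\frac{X+Y}{2}\right) = \frac{\operatorname{Var}(X) + \operatorname{Var}(Y) - 2\operatorname{Cov}(X,Y)}{4} = \frac14 \operatorname{Var}(X-Y) \ge 0. $$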

$\endgroup$
12
$\begingroup$

A way to see this at a glance is that real random variables with finite second moment form an inner product space, with $\langle X,Y \rangle = \mathbb{E}XY$. The norm induced by this inner product satisfies $\|X\|^2=\mathbb{E}X^2$, and both $\|X\|$ and $\|X\|^2$ are convex functions of $X$ in any inner product space.

Furthermore $\mathbb{E}X$ is linear, so $f(X)=X-\mathbb{E}X$ is linear, and a convex function composed with a linear transformation is always still convex.

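This gives us that $$\operatorname{Var}(X)=\langle X-\mathbb{E}X , X-\mathbb{E}X \rangle$$ is convex. Applying the definition of convexity with equal weights $\frac1n$ to $X_1, \ldots, X_n$ gives $\operatorname{Var}(\bar{X}) \le \frac1n \sum_{i=1}^n \operatorname{Var}(X_i)$, which is your conjecture.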
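
As a quick illustration of the convexity claim (a sketch, not a proof), the following pure-Python snippet checks $\operatorname{Var}(tX+(1-t)Y) \le t\operatorname{Var}(X) + (1-t)\operatorname{Var}(Y)$ on paired samples. The sample data and the function name `convexity_gap` are assumptions chosen for the example. The variance here is that of the empirical distribution, for which the inequality holds exactly up to rounding.

```python
import random
from statistics import pvariance


def convexity_gap(xs, ys, t):
    """Return t*Var(X) + (1-t)*Var(Y) - Var(t*X + (1-t)*Y) for paired samples."""
    mix = [t * x + (1 - t) * y for x, y in zip(xs, ys)]
    return t * pvariance(xs) + (1 - t) * pvariance(ys) - pvariance(mix)


rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(500)]
# ys is deliberately dependent on xs to mimic the dependent setting.
ys = [0.5 * x + rng.gauss(0, 2) for x in xs]

# Evaluate the gap on a grid of weights t = 0, 0.1, ..., 1.
gaps = [convexity_gap(xs, ys, k / 10) for k in range(11)]
print(min(gaps) >= -1e-9)
```

Algebraically the gap equals $t(1-t)\operatorname{Var}(X-Y)$, which is why it can never be negative.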

$\endgroup$
4
$\begingroup$

For fixed individual variances, $$\text{Var}\left(\sum X_i\right) = \sum\limits_i \text{Var}\left( X_i\right) +\sum\limits_i \sum\limits_{j\not=i} \text{Cov}\left( X_i,X_j\right)$$ is maximised when the covariances take their maximum possible positive values, which happens when all the correlations are $+1$.

So the highest variance case for $\sum X_i$ and thus $\bar X$ will be when there is perfect positive correlation between the $X_i$, in which case $$\text{SD}(\sum X_i) = \sum \text{SD}(X_i)$$ giving $$\text{Var}(\bar X) = \frac1{n^2} \text{Var}\left(\sum X_i\right)=\left(\frac1n \sum \text{SD}(X_i)\right)^2 .$$

Then, using the Cauchy–Schwarz inequality:

$$\left(\frac1n \sum \text{SD}(X_i)\right)^2 \le \frac1n \sum \left(\text{SD}(X_i)^2\right) = \frac{1}{n}\sum \text{Var}(X_i)$$

with equality only when all the $\text{SD}(X_i)$ are equal.

So your $\text{Var}(\bar{X}) \leq \frac{1}{n}\sum \text{Var}(X_i)$ is correct,

with equality only when $X_i-E[X_i]=X_j-E[X_j]$ for all $i,j$ so when you have identical variances and perfect positive correlation (though possibly different expectations).
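
To make the equality condition concrete, here is a small sketch using exact fractions. It assumes perfectly positively correlated variables, so that $\text{Var}(\bar X) = \left(\frac1n \sum \text{SD}(X_i)\right)^2$ as above; the function name `bound_pair` and the example standard deviations are made up for illustration.

```python
from fractions import Fraction


def bound_pair(sds):
    """For perfectly positively correlated X_i with the given standard deviations,
    return (Var(X-bar), average of the variances) as exact fractions."""
    n = len(sds)
    var_mean = (sum(sds) / n) ** 2          # ((1/n) * sum of SD_i)^2
    avg_var = sum(s * s for s in sds) / n   # (1/n) * sum of SD_i^2
    return var_mean, avg_var


equal_sds = [Fraction(2), Fraction(2), Fraction(2)]
unequal_sds = [Fraction(1), Fraction(2), Fraction(3)]

print(bound_pair(equal_sds))    # both entries equal 4
print(bound_pair(unequal_sds))  # 4 versus 14/3
```

With equal standard deviations the two quantities coincide, and with unequal ones the inequality is strict, matching the Cauchy–Schwarz step.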

$\endgroup$
2
  • 1
    $\begingroup$ While the situation described in the first sentence seems plausible, the description does not mathematically justify its correctness. $\endgroup$ Commented Jul 7 at 17:10
  • $\begingroup$ @GregMartin it is well known. I have added an additional introductory line $\endgroup$
    – Henry
    Commented Jul 7 at 17:20
