Is the variance of the mean of a set of possibly dependent random variables less than the average of their respective variances?

Question

Is the variance of the mean of a set of possibly dependent random variables less than or equal to the average of their respective variances?

Mathematically, given random variables $X_1, X_2, ..., X_n$ that may be dependent:

Let $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ be the mean of these random variables.

Is it true that:

$$\text{Var}(\bar{X}) \leq \frac{1}{n}\sum_{i=1}^n \text{Var}(X_i)$$

I know that for independent random variables, we have the following equality:

$$\text{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^n \text{Var}(X_i)$$

This clearly satisfies the inequality. However, I'm unsure whether it holds for dependent variables.

If this inequality is true, is there a proof or intuitive explanation?

If it's not always true, are there conditions under which it holds? What about the following inequality?

$$\text{Var}(\bar{X}) \leq \max_{1 \le i \le n} \text{Var}(X_i)$$

Any insights, proofs, or counterexamples would be greatly appreciated. Thank you!

Amir · Accepted Answer · 2024-07-08 18:31:56Z


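Yes, it is true. Here is a proof. The first inequality is Cauchy–Schwarz, $\operatorname{Cov}(X_i,X_j)\le\sqrt{\operatorname{Var}(X_i)\operatorname{Var}(X_j)}$, and the second is the AM–GM inequality $\sqrt{ab}\le\frac{a+b}{2}$.

$$
\begin{align}
\operatorname{Var}(\overline{X})
&= \frac1{n^2}\operatorname{Var}\left(\sum_{i=1}^n X_i\right) \\
&= \frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\operatorname{Cov}(X_i,X_j) \\
&\le \frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\sqrt{\operatorname{Var}(X_i)\cdot \operatorname{Var}(X_j)} \\
&\le \frac1{n^2}\sum_{i=1}^n\sum_{j=1}^n\frac{\operatorname{Var}(X_i)+\operatorname{Var}(X_j)}{2} \\
&= \frac1n\sum_{i=1}^n \operatorname{Var}(X_i).
\end{align}
$$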
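As a quick numerical sanity check of the bound proved above, here is a minimal sketch using only the Python standard library. The shared-factor construction and the `loadings` values are assumptions made purely for illustration; any dependent sample would do. Since the average of the variances can never exceed the largest of them, the same run also illustrates the $\max$ version of the inequality from the question.

```python
import random
from statistics import fmean, pvariance

random.seed(0)

# Hypothetical construction: every X_i loads on one shared factor z,
# so the X_i are dependent (some positively, some negatively correlated).
loadings = [1.0, -0.5, 2.0, 0.3]
n, samples = len(loadings), 2000

rows = []
for _ in range(samples):
    z = random.gauss(0, 1)
    rows.append([a * z + random.gauss(0, 1) for a in loadings])

# Variance of the mean X-bar, computed over the empirical distribution.
xbar = [fmean(row) for row in rows]
var_of_mean = pvariance(xbar)

# Variance of each X_i, then their average and their maximum.
variances = [pvariance([row[i] for row in rows]) for i in range(n)]
mean_of_vars = fmean(variances)
max_var = max(variances)

print(var_of_mean, mean_of_vars, max_var)
```

Because the empirical distribution of the sample is itself a probability distribution, the inequality holds for these population variances exactly (up to floating-point rounding), not just approximately.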

Answered Jul 7 at 17:23 by Mike Earnest; edited Jul 8 at 18:31 by Amir.

Abezhiko · Accepted Answer · 2024-07-07 18:31:50Z

In general, one has:

$$
\operatorname{Var}\left(\sum_{k=1}^n X_k\right) = \sum_{i,j=1}^n \operatorname{Cov}(X_i,X_j)
$$

Now, the well-known inequality $ab \le \frac{1}{2}(a^2+b^2)$ lets us write:

$$
\begin{align}
\operatorname{Cov}(X,Y) &= \Bbb{E}\left[(X-\Bbb{E}[X])(Y-\Bbb{E}[Y])\right] \\
&\le \frac{1}{2}\Bbb{E}\left[(X-\Bbb{E}[X])^2 + (Y-\Bbb{E}[Y])^2\right] \\
&= \frac{1}{2} \left(\operatorname{Var}(X) + \operatorname{Var}(Y)\right)
\end{align}
$$

Hence

$$
\operatorname{Var}\left(\sum_{k=1}^n X_k\right) \le \frac{1}{2} \sum_{i,j=1}^n \left(\operatorname{Var}(X_i) + \operatorname{Var}(X_j)\right) = n \sum_{k=1}^n \operatorname{Var}(X_k)
$$

and finally

$$
\operatorname{Var}\left(\bar{X}\right) = \frac{1}{n^2} \operatorname{Var}\left(\sum_{k=1}^n X_k\right) \le \frac{1}{n} \sum_{k=1}^n \operatorname{Var}(X_k)
$$
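The key step in this answer is the pairwise bound $\operatorname{Cov}(X,Y) \le \frac12(\operatorname{Var}(X)+\operatorname{Var}(Y))$. The sketch below checks it on a dependent pair. The helper `pcov` and the way `ys` is built from `xs` are assumptions introduced only for this illustration.

```python
import random
from statistics import fmean

def pcov(xs, ys):
    """Population covariance of two equal-length samples (illustrative helper)."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

random.seed(1)
xs = [random.gauss(0, 2) for _ in range(1000)]
# Hypothetical dependence: y is a noisy linear function of x.
ys = [0.8 * x + random.gauss(0, 1) for x in xs]
neg_ys = [-y for y in ys]  # flips the sign of the covariance, keeps the variance

cov_xy = pcov(xs, ys)
cov_x_negy = pcov(xs, neg_ys)
bound = 0.5 * (pcov(xs, xs) + pcov(ys, ys))  # (Var X + Var Y) / 2

print(cov_xy, cov_x_negy, bound)
```

The bound is one-sided: it limits how large a positive covariance can be, and a negative covariance satisfies it trivially.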

Zoe Allen · Accepted Answer · 2024-07-07 17:55:04Z

A way to see this at a glance is that real random variables with finite second moment form an inner product space, with $\langle X,Y \rangle = \mathbb{E}[XY]$. The norm induced by this inner product satisfies $\|X\|^2=\mathbb{E}[X^2]$, and both $\|X\|$ and $\|X\|^2$ are always convex functions on an inner product space.

Furthermore, $\mathbb{E}X$ is linear in $X$, so $f(X)=X-\mathbb{E}X$ is linear, and a convex function composed with a linear transformation is still convex.

This gives us that $$\operatorname{Var}(X)=\langle X-\mathbb{E}X , X-\mathbb{E}X \rangle$$ is convex, from which your conjecture follows by applying Jensen's inequality to the average $\bar{X}=\frac1n\sum_{i=1}^n X_i$: $$\operatorname{Var}\left(\frac1n\sum_{i=1}^n X_i\right) \le \frac1n\sum_{i=1}^n \operatorname{Var}(X_i).$$
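Convexity of the variance can be checked directly for two variables: for $t\in[0,1]$, $\operatorname{Var}(tX+(1-t)Y)\le t\operatorname{Var}(X)+(1-t)\operatorname{Var}(Y)$. A minimal sketch, with a nonlinear dependence between `xs` and `ys` chosen arbitrarily for illustration:

```python
import random
from statistics import pvariance

random.seed(2)
xs = [random.uniform(-1, 3) for _ in range(500)]
# Hypothetical nonlinear dependence of y on x.
ys = [x * x + random.gauss(0, 0.5) for x in xs]

var_x, var_y = pvariance(xs), pvariance(ys)

# gap(t) = t*Var(X) + (1-t)*Var(Y) - Var(t*X + (1-t)*Y); convexity says gap >= 0.
gaps = []
for k in range(11):
    t = k / 10
    mix = [t * x + (1 - t) * y for x, y in zip(xs, ys)]
    gaps.append(t * var_x + (1 - t) * var_y - pvariance(mix))

print(min(gaps), max(gaps))
```

The gap equals $t(1-t)\operatorname{Var}(X-Y)$, so it vanishes at the endpoints $t=0$ and $t=1$ and is nonnegative in between.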

Amir · Accepted Answer · 2024-07-08 18:43:27Z

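For fixed standard deviations $\text{SD}(X_i)$, the variance of the sum $$\text{Var}\left(\sum X_i\right) = \sum\limits_i \text{Var}\left( X_i\right) +\sum\limits_i \sum\limits_{j\not=i} \text{Cov}\left( X_i,X_j\right)$$ is maximised when the covariances take their maximum possible positive values. Since $\text{Cov}(X_i,X_j) \le \text{SD}(X_i)\,\text{SD}(X_j)$ by the Cauchy–Schwarz inequality, this happens when all the correlations are $+1$.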

So the highest variance case for $\sum X_i$ and thus $\bar X$ will be when there is perfect positive correlation between the $X_i$, in which case $$\text{SD}(\sum X_i) = \sum \text{SD}(X_i)$$ giving $$\text{Var}(\bar X) = \frac1{n^2} \text{Var}\left(\sum X_i\right)=\left(\frac1n \sum \text{SD}(X_i)\right)^2 .$$

Then, using the Cauchy–Schwarz inequality:

$$\left(\frac1n \sum \text{SD}(X_i)\right)^2 \le \frac1n \sum \left(\text{SD}(X_i)^2\right) = \frac{1}{n}\sum \text{Var}(X_i)$$

with equality only when all the $\text{SD}(X_i)$ are equal.

So your $\text{Var}(\bar{X}) \leq \frac{1}{n}\sum \text{Var}(X_i)$ is correct, with equality only when $X_i-E[X_i]=X_j-E[X_j]$ for all $i,j$, that is, when you have identical variances and perfect positive correlation (though possibly different expectations).
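The equality condition can be illustrated with exact arithmetic. In this sketch every $X_i = c_i + s_i Z$ is a shifted and scaled copy of one variable $Z$, so all correlations are $+1$. The values of $Z$, the shifts and the scales are made up for the example, and `compare` is a hypothetical helper.

```python
from fractions import Fraction
from statistics import pvariance

# Z takes these five values with equal probability, so Var(Z) = 2.
z = [Fraction(v) for v in (-2, -1, 0, 1, 2)]

def compare(shifts, scales):
    """Return (Var(X-bar), average of Var(X_i)) for X_i = shift_i + scale_i * Z."""
    xs = [[c + s * v for v in z] for c, s in zip(shifts, scales)]
    n = len(xs)
    xbar = [sum(col) / n for col in zip(*xs)]  # value of X-bar in each outcome
    return pvariance(xbar), sum(pvariance(x) for x in xs) / n

# Equal standard deviations, different expectations: equality holds.
equal_case = compare(shifts=[0, 5, -3], scales=[2, 2, 2])
# Perfect correlation but unequal standard deviations: strict inequality.
unequal_case = compare(shifts=[0, 5, -3], scales=[1, 2, 3])

print(equal_case)    # both entries are 8
print(unequal_case)  # 8 versus 28/3
```

With equal scales both sides equal $8$. With scales $1,2,3$ the variance of the mean is still $8$, but the average variance rises to $28/3$.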

While the situation described in the first sentence seems plausible, the description does not mathematically justify its correctness. — Greg Martin, Commented Jul 7 at 17:10
@GregMartin it is well known. I have added an additional introductory line — Henry, Commented Jul 7 at 17:20
