Precise mathematical translation of the 68–95–99.7 rule? (Not a proof!)
The rule:
In statistics, the 68–95–99.7 rule, also known as the three-sigma rule or empirical rule, states that nearly all values lie within 3 standard deviations of the mean in a normal distribution.
About 68.27% of the values lie within 1 standard deviation of the mean. Similarly, about 95.45% of the values lie within 2 standard deviations of the mean. Nearly all (99.73%) of the values lie within 3 standard deviations of the mean.
So suppose that I have a set of values (measurements) that is normally distributed. Let's call it S.
When they say "about 68.27% of the values", what values do they mean? Do they mean that the standard deviation of any 68.27% of the elements of S is smaller than 1? Do they mean something more? Could someone give me a precise mathematical statement that is equivalent to this "68–95–99.7 rule"?
I've posted this on math.stackexchange because I would like a mathematical answer.
3 Answers
The mathematical statement of the "within one standard deviation" rule is that
$$\Pr(\mu-\sigma < X < \mu + \sigma) =\frac{1}{\sqrt{2 \pi} \sigma} \int_{\mu - \sigma}^{\mu + \sigma} \exp \left( - \frac{(x-\mu)^2}{2 \sigma^2} \right) \; dx = \frac{1}{\sqrt{2 \pi}} \int_{-1}^1 \exp \left( - \frac{u^2}{2} \right) \; du \approx 0.682689$$
(In the integral, just make the substitution $u = (x-\mu)/\sigma$.)
Is that what you had in mind? The other statements are similar, just replacing $\sigma$ with $2 \sigma$ or $3 \sigma$.
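As a quick numerical check of these three probabilities, here is a minimal sketch. It relies on the standard identity $\frac{1}{\sqrt{2\pi}}\int_{-k}^{k} e^{-u^2/2}\,du = \operatorname{erf}(k/\sqrt{2})$, which follows from the further substitution $t = u/\sqrt{2}$. The function name `prob_within` is only illustrative and not part of the original answer.

```python
import math

def prob_within(k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X.

    After standardising, the integral equals erf(k / sqrt(2)),
    so the result does not depend on mu or sigma.
    """
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.6f}")
# within 1 sigma: 0.682689
# within 2 sigma: 0.954500
# within 3 sigma: 0.997300
```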
Precisely, they mean that if you could observe an "infinite" number of values from your normal distribution, 68% would be within 1 standard deviation of the mean, as specified by the parameters of your distribution, 95% would be within 2 standard deviations, and 99.7% within 3 standard deviations.
Of course, you cannot take an infinite number of observations. But the larger the finite number of observations you take, the closer your results will tend to be to the infinite case. If you take only a few observations, the results could be very different from the ideal you ask about. But if you had hundreds of observations, you'd be surprisingly close. This is the content of the Law of Large Numbers.
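The convergence described here can be illustrated with a small simulation. This is a sketch only: the function name `fraction_within`, the fixed seed, and the sample sizes are arbitrary choices for illustration, and the observed fractions will differ with a different seed.

```python
import random

def fraction_within(n, k=1.0, mu=0.0, sigma=1.0, seed=0):
    """Fraction of n simulated normal draws lying within k*sigma of mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(abs(rng.gauss(mu, sigma) - mu) < k * sigma for _ in range(n))
    return hits / n

# Small samples can be far from 0.6827; large samples should settle near it.
for n in (10, 100, 1_000, 100_000):
    print(n, fraction_within(n))
```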
If $X$ is a normally distributed random variable, then $$\Pr\left(\left|\frac{X-\mu}{\sigma}\right|\le 1\right)\approx 0.6827.$$ Here $\mu$ is the population mean, and $\sigma$ is the population standard deviation.
Similar facts hold for the other two numbers you mentioned.
If we do repeated independent sampling, that can be represented as a sequence $X_1,X_2,X_3,\dots, X_n$ of independent random variables, each with the same normal distribution (mean $\mu$, variance $\sigma^2$). If $n$ is large, then with reasonably high probability the proportion of the sample results that lie between $\mu-\sigma$ and $\mu+\sigma$ will be not far from $68\%$. However, even with $n$ around $1000$, we can only be about $95\%$ sure that the experimental proportion will be between $65\%$ and $71\%$.
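The $65\%$ to $71\%$ range can be reproduced with a rough calculation. The sketch below assumes the usual normal approximation to the binomial distribution: the count of observations inside the interval is binomial with success probability $p \approx 0.6827$, so the sample proportion has standard error $\sqrt{p(1-p)/n}$. The function name `proportion_interval` and the multiplier $z = 1.96$ (for roughly $95\%$ coverage) are assumptions of this illustration, not part of the original answer.

```python
import math

def proportion_interval(n, p=0.682689, z=1.96):
    """Approximate range containing the sample proportion with ~95% probability.

    Uses the normal approximation to the binomial: p +/- z * sqrt(p(1-p)/n).
    """
    se = math.sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return p - z * se, p + z * se

lo, hi = proportion_interval(1000)
print(f"{lo:.3f} to {hi:.3f}")  # roughly 0.654 to 0.712
```

Because the standard error shrinks like $1/\sqrt{n}$, halving the width of this range takes about four times as many observations.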