Here there and everywhere

So mathematicians will go on about all the cool properties of Gaussian distributions: for example, they are their own Fourier transform, their cumulants beyond second order are all zero, and so on. But why should any halfway normal person be interested in them?

The answer lies in something called the "Central Limit Theorem". I'll describe the gist of it. Remember back in section 1.3.8 when we asked the question: what's the probability of getting $m$ heads when you toss a coin $n$ times? We saw that this was equivalent to asking: what's the probability you'll make \$$m$ when you toss a coin $n$ times, given the rule that you make nothing if it lands tails and \$1 if it lands heads? In this case you're summing up random variables $x_i$,

\begin{displaymath}
X = \sum_{i=1}^n x_i
\end{displaymath} (1.56)

This is the scenario we discussed in subsection 1.5.6, so the binomial distribution is just the probability of getting different values of $X$.

What I'm saying is that you can think of the binomial distribution as the probability distribution for the sum of a bunch of independent random variables, in this case ones where each random variable takes the value $0$ with probability $1-p$ and $1$ with probability $p$.
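To make this concrete, here is a minimal Python sketch (not part of the original notes) that builds the distribution of $X$ by adding one coin at a time and checks it against the binomial formula. The function names and the values $n=10$, $p=0.3$ are illustrative assumptions.

```python
from math import comb

def bernoulli_sum_pmf(n, p):
    """Distribution of X = x_1 + ... + x_n, built by adding one coin at a time."""
    pmf = [1.0]  # with zero tosses, X = 0 with certainty
    for _ in range(n):
        new = [0.0] * (len(pmf) + 1)
        for m, prob in enumerate(pmf):
            new[m] += prob * (1 - p)   # this toss lands tails: X stays at m
            new[m + 1] += prob * p     # this toss lands heads: X goes to m + 1
        pmf = new
    return pmf

def binomial_pmf_list(n, p):
    """Binomial probabilities for m = 0, ..., n from the closed-form formula."""
    return [comb(n, m) * p**m * (1 - p)**(n - m) for m in range(n + 1)]

n, p = 10, 0.3  # illustrative values, not taken from the notes
direct = bernoulli_sum_pmf(n, p)
formula = binomial_pmf_list(n, p)

# Largest disagreement between the two; it should be at rounding-error level.
max_diff = max(abs(a - b) for a, b in zip(direct, formula))
print(max_diff)
```

The two lists agree up to floating-point rounding, which is just the statement that the binomial distribution is the distribution of a sum of independent 0-or-1 variables.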

We also discussed how it looked, but I left out an important observation: the binomial distribution looks more and more like a Gaussian distribution when the number of trials $n$ gets large! Check it out: in figure 1.3.9 we saw a distribution that looks a lot like a brontosaurus. Let's compare the two on the same graph.

\begin{figure}\begin{center}
\epsfig{width=.4\textwidth,file=cmp.eps}
\end{center}\end{figure}
Above is a figure of the two distributions, the Gaussian (green dashed) and the binomial distribution of figure 1.3.9 (red impulses), on the same graph. Even here with $n=50$ (which isn't all that big), you can see that they line up right on top of each other. The Gaussian was chosen to have the same mean and variance as the binomial distribution. We computed these earlier in sections 1.5.6 and 1.5.7.
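If you'd like to see the agreement in numbers rather than by eye, here is a rough Python sketch of the same comparison. It uses the standard binomial mean $np$ and variance $np(1-p)$ for the Gaussian. The value $p = 0.5$ is an assumption, since this section doesn't restate the value used in figure 1.3.9, and the function names are made up for illustration.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(m, n, p):
    """Probability of exactly m heads in n tosses."""
    return comb(n, m) * p**m * (1 - p)**(n - m)

def gaussian_pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return exp(-(x - mean)**2 / (2 * var)) / sqrt(2 * pi * var)

n, p = 50, 0.5           # n = 50 as in the text; p = 0.5 is assumed
mean = n * p             # binomial mean
var = n * p * (1 - p)    # binomial variance

# Print a few values side by side: m, binomial, Gaussian.
for m in range(15, 36, 5):
    print(m, binomial_pmf(m, n, p), gaussian_pdf(m, mean, var))

# Largest pointwise gap over all allowed values of m.
max_gap = max(abs(binomial_pmf(m, n, p) - gaussian_pdf(m, mean, var))
              for m in range(n + 1))
print(max_gap)
```

Under these assumptions the largest gap is small compared with the peak height of the distribution, which is what "right on top of each other" looks like numerically.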

So where do the two curves differ? A Gaussian distribution is nonzero all along the real line, whereas a binomial can't be nonzero for $m < 0$ or $m > n$. (If you do an experiment where you toss a coin 50 times, you can't get 51 heads, unless of course you're including the pink elephants.) So clearly at the tails of the distribution, these two distributions do differ. But how big is a Gaussian in that region? It's exponentially small, as you'll now verify in the following problem.




Josh Deutsch 2009-03-05