Let's get back to the putative difference in height between Swedish
men and women. We have to form a statistic to use on the data. The
obvious one is the difference in the means between the men and the women.
You have 25 men and 25 women.
Let's call the mean you measured for the men using eqn. 2.1 $\bar{x}_m$;
for women, call it $\bar{x}_w$.
Let's call the difference
$\bar{d} = \bar{x}_m - \bar{x}_w$. Remember this isn't the true
difference you'd get if you used the whole population, only an estimate.
Maybe you get some particular value. Remember that if we did the experiment
again we might get a somewhat different value.
How do we calculate the likelihood of finding the difference we measured
if women and men have precisely the same distribution of height (that is, if the null hypothesis $H_0$ is true)?
To figure that out, we have to know what the probability distribution
is for our estimate of the mean difference.
But assuming, as we are, that there's no difference in the two distributions,
the true difference is 0. In that case we have some brontosaurus-like curve
for the probability of mean-differences.
Let's plot what we'd expect it to look like assuming $H_0$:
The actual functional form of this is quite subtle and something that I'll
discuss a little later. Let's try to understand the basics first.
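To get a feel for the brontosaurus-like curve without knowing its exact form, here's a rough simulation sketch. It assumes both groups are drawn from one and the same Gaussian; the values of mu and sigma are made-up illustration numbers, not real Swedish data, and the function name is hypothetical. It repeats the 25-versus-25 experiment many times and looks at where the mean differences pile up.

```python
import random
import statistics

def simulate_null_differences(n_per_group=25, n_trials=10_000,
                              mu=175.0, sigma=7.0, seed=0):
    """Simulate mean differences when both groups share one distribution.

    mu and sigma (in cm) are assumed illustration values, not measurements.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        # Under the no-difference assumption, "men" and "women" are just
        # two independent samples from the same population.
        men = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        women = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        diffs.append(statistics.fmean(men) - statistics.fmean(women))
    return diffs

diffs = simulate_null_differences()
print("centre of the curve:", round(statistics.fmean(diffs), 2))
print("width of the curve: ", round(statistics.stdev(diffs), 2))
```

A histogram of diffs should be a hump centred near 0, and every rerun of the "experiment" lands somewhere different under it.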
Assuming $H_0$ (no difference between heights of men and women),
it's most likely that you'll get a difference of 0.
But we've found a difference some way from 0. That doesn't seem
so likely. How likely is it? Well, we're asking slightly the wrong question.
The probability of getting exactly some number is 0 for a continuous
distribution. We want to know what's the probability we'd get the measured value or something
even more unlikely, like a still larger value. So what we really want to know is
the probability of finding a value equal to the measured difference or greater by chance.
We already know how to calculate this from eqn. 1.11; it's just the area under the curve:
So the area in red is the probability that you'd get a value equal to the measured difference or greater
by chance, while in fact there's actually no difference.
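One way to estimate that tail area straight from the data is a permutation test. To be clear, this is a swapped-in technique, not the calculation developed in this text: if there's really no difference, the labels "man" and "woman" are arbitrary, so you can shuffle them and count how often the shuffled difference is at least as big as the one you measured. The function name and the sample data below are hypothetical.

```python
import random
import statistics

def permutation_p_value(men, women, n_shuffles=10_000, seed=0):
    """Estimate the one-sided probability of a mean difference at least as
    large as the observed one, assuming the group labels are interchangeable."""
    rng = random.Random(seed)
    observed = statistics.fmean(men) - statistics.fmean(women)
    pooled = list(men) + list(women)   # copy, so the inputs are not modified
    n_men = len(men)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)            # reassign the labels at random
        diff = (statistics.fmean(pooled[:n_men])
                - statistics.fmean(pooled[n_men:]))
        if diff >= observed:           # "this value or greater"
            count += 1
    return count / n_shuffles

# Made-up heights in cm, generated only for illustration.
rng = random.Random(1)
men = [rng.gauss(180.0, 7.0) for _ in range(25)]
women = [rng.gauss(167.0, 7.0) for _ in range(25)]
p = permutation_p_value(men, women)
print("estimated tail probability:", p)
```

The returned fraction plays the role of the area in red, estimated by counting rather than by integrating the curve.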
If the area in red is less than your chosen threshold, then you'd say
"if there were no difference in the height statistics between
the groups, a result this extreme would be improbable, so we reject the assumption of no difference".
So in that case you conclude there's a statistically significant difference between the heights of men
and women based on your data.
If the area in red is greater than your chosen threshold, then you'd say
"even with no difference in the height statistics between
the groups, a result this extreme isn't that improbable, therefore we can't reject the assumption that there's no difference".
In that case you conclude that the difference in the height is not statistically significant.
Of course if you used 5000 measurements instead, you might then change your tune. You do the
best you can based on the available data.
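Here's a small sketch of why the sample size can change your tune. It uses a plain normal approximation with a known spread sigma, which is a simplifying assumption and not the exact functional form mentioned earlier. The observed difference of 2 cm, sigma of 7 cm, and the threshold of 0.05 are all assumed illustration values, and the function name is hypothetical.

```python
import math

ALPHA = 0.05  # a conventional threshold, assumed here for illustration

def one_sided_p_normal(diff, sigma, n_per_group):
    """Tail area at or beyond `diff` for the difference of two sample means,
    assuming both groups are Gaussian with the same known sigma."""
    # Standard error of a difference of two means, n measurements each.
    se = sigma * math.sqrt(2.0 / n_per_group)
    z = diff / se
    # Upper-tail area of a standard normal beyond z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

for n in (25, 5000):
    p = one_sided_p_normal(diff=2.0, sigma=7.0, n_per_group=n)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"n = {n:5d} per group: p = {p:.3g} ({verdict})")
```

The same measured difference gives a much smaller tail area with more measurements, because the curve for the mean difference gets narrower as the sample grows.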
You might get ridiculed for saying that there's no statistically
significant difference in this case. "Come on, just look down
the street. You can clearly see an average difference in height",
or "everybody knows that there's a difference". But that kind of talk
is what kept doctors blood-letting for centuries. They had some wacky
ideas about the four humours, blood, bile, and so on, whose relative
proportions told you about a person's mood and health. Of course they
knew it worked. Do you think anyone did a proper study of this? They just
thought from theory and empirical observation that it worked. It worked
in the sense that some ill people would sometimes get better despite
being exsanguinated. But did it work better than nothing? These days,
people have decided it wasn't such a great idea. Now medicine routinely uses, or
attempts to use, statistics as a way of devising new treatments.
josh
2010-10-20