Correlation vs. Causation
Smoking causes lung cancer. That's hardly a controversial statement anymore. But how do you know that people who get addicted to smoking don't have a genetic difference that predisposes them to lung cancer? You'd have to do an experiment to control for that, or come up with a clear medical explanation of the carcinogenic effects of smoking. Of course that can be done, and no one is going to take the genetic argument seriously.
But it does raise a serious question about how to use correlation measurements
to draw inferences about things affecting each other. In the example of
smoking and cancer, you can define a variable
Let's take some other ludicrous examples to explain
the problem of correlation vs. causation. Define
How about anti-baldness lotion? Define
Same with diet food. Here diet food means not lettuce but
those premade meals you find in the frozen section with "diet" or "low fat"
written all over them. I bet you'll find that people who eat diet food
tend to be fatter than those who don't.
Define
Now how about ant poison? How many people without ant problems in their house have a lot of ant poison on the floor? How about those with a big ant problem? Does this imply that ant poison is causing the ant problem? Not unless you believe the CEOs of these poison companies are just big ants in disguise, doing their best to keep their relations well fed.
The upshot of all this is that causation and correlation are very different. Diets, ant poisons, and anti-baldness balms are all doing what they're supposed to do, not the opposite. Causation causes correlation, but not necessarily the converse. There is a lot more to proving causation than this simple correlation formula, and that's why you've got to be very careful reading news stories, or even medical journals, that purport to show that A causes B.
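
To see how this plays out with numbers, here is a minimal Python sketch of the ant poison example. Everything in it is made up for illustration: the hidden `severity` of the infestation, the assumed effect that each unit of poison removes 0.3 units of ants, the noise levels, and the function names `pearson_r` and `simulate`. It also assumes that the "correlation formula" referred to here is the usual Pearson correlation coefficient. In the "survey" case, households with worse infestations put out more poison. In the "experiment" case, the dose is assigned at random, which is the kind of controlled experiment you would need to settle the smoking question.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n, randomized, seed=0):
    """Return the correlation between poison used and ants seen in n households.

    All numbers below are illustrative assumptions, not measured data.
    """
    rng = random.Random(seed)
    poison, ants = [], []
    for _ in range(n):
        # Hidden quantity: how bad the infestation would be with no poison.
        severity = rng.uniform(0, 10)
        if randomized:
            # Experiment: the dose is assigned at random, ignoring severity.
            dose = rng.uniform(0, 10)
        else:
            # Survey: people with a bigger ant problem put out more poison.
            dose = severity + rng.gauss(0, 1)
        # Assumed causal effect: each unit of poison removes 0.3 units of ants.
        count = severity - 0.3 * dose + rng.gauss(0, 1)
        poison.append(dose)
        ants.append(count)
    return pearson_r(poison, ants)


r_survey = simulate(5000, randomized=False)
r_experiment = simulate(5000, randomized=True)
print("survey correlation:    ", round(r_survey, 2))
print("experiment correlation:", round(r_experiment, 2))
```

With these assumed numbers the survey correlation should come out clearly positive and the experimental one negative, even though the poison works exactly the same way in both cases. Only the randomized version shows which way the effect actually goes.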