
Deceptive Correlations

Now we can get back to statistics. We learned in section 1.8 how to quantify how one variable correlates with another. Lots of predictors are based on these ideas of statistical correlation. If you look at how things like a price increase, or the number of shares on the bid and ask, at one time correlate with the price in the future, you find there is a sizable correlation.

Say we consider three sequential prices of a stock, $p_{before}$, $p_{now}$, and $p_{after}$. Then we can define $\Delta p_{last} = p_{now}-p_{before}$ and $\Delta p_{next} = p_{after}-p_{now}$. You can also look at the depth on the bid and the ask. When you average over many such triples, the correlation coefficient between $\Delta p_{last}$ and $\Delta p_{next}$ comes out clearly positive.
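To make the measurement concrete, here is a minimal sketch of computing the correlation coefficient between $\Delta p_{last}$ and $\Delta p_{next}$ from a price series. The function names and the synthetic price generator (with its persistence parameter `phi`) are assumptions made purely for illustration; real tick data would replace the generator.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def successive_change_correlation(prices):
    """Correlate each price change (dp_last) with the one after it (dp_next)."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    dp_last = changes[:-1]
    dp_next = changes[1:]
    return pearson(dp_last, dp_next)


def synthetic_prices(n, phi, seed=0):
    """Hypothetical price series whose changes carry a persistence of phi.

    This is an illustrative stand-in for real tick data, not a market model.
    """
    rng = random.Random(seed)
    prices = [100.0]
    change = 0.0
    for _ in range(n):
        change = phi * change + rng.gauss(0.0, 0.01)
        prices.append(prices[-1] + change)
    return prices


rho = successive_change_correlation(synthetic_prices(20000, phi=0.3))
print(f"correlation of successive changes: {rho:.3f}")
```

With a long enough series, the measured coefficient should land close to the `phi` that was built into the synthetic changes.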

The positive correlation exists only between price movements separated by short times, of order seconds, but it's enough to make you think that you can make billions of dollars in a year, starting with a few bucks (by buying and selling like a maniac hundreds of times a day). You'd have to be an extreme optimist to think that such a simple prediction approach would work. And of course it doesn't work in actual practice. The reason why is informative: you don't get the trades you think you're going to get.

When you communicate with a market maker and ask for an order to be filled at the price he's advertising, he doesn't necessarily have to fill it. In many circumstances, he can "back away". And he doesn't have to decide immediately what to do. He can wait about 15 seconds before deciding. Now that's a long time on the time scale of an active stock. Lots of things can change in that time, and the price can easily move. If it looks like you'll win and he'll lose by the trade, he won't fill it. But if you're wrong, he'll happily fill it for you. This way he makes money and you don't.
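The effect of backing away can be illustrated with a toy simulation. The sketch below is a deliberately crude model, not a description of any real market: price changes are synthetic with persistence `phi`, you always trade in the direction of the last change, and `back_away_prob` (a made-up parameter) is the chance that the market maker declines a trade that would have been profitable for you. Losing trades are always filled.

```python
import random


def simulate(n_trades, phi, back_away_prob, seed=1):
    """Return (paper, realized) average profit per attempted trade.

    paper    : profit if every order were filled at the advertised price
    realized : profit when winning orders are declined with back_away_prob
    All parameters are illustrative assumptions.
    """
    price_rng = random.Random(seed)      # drives the synthetic price changes
    fill_rng = random.Random(seed + 1)   # drives the market maker's decision
    prev = price_rng.gauss(0.0, 1.0)
    paper = 0.0
    realized = 0.0
    for _ in range(n_trades):
        nxt = phi * prev + price_rng.gauss(0.0, 1.0)
        side = 1.0 if prev > 0 else -1.0  # bet that the last move continues
        pnl = side * nxt
        paper += pnl
        declined = pnl > 0 and fill_rng.random() < back_away_prob
        if not declined:
            realized += pnl
        prev = nxt
    return paper / n_trades, realized / n_trades


for prob in (0.0, 0.5, 1.0):
    paper, realized = simulate(50000, phi=0.3, back_away_prob=prob)
    print(f"back_away_prob={prob:.1f}  paper={paper:+.3f}  realized={realized:+.3f}")
```

The paper profit is the same in every run, since it depends only on the price data. The realized profit falls as `back_away_prob` rises, and when every winning trade is declined only the losers remain.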

This shows you've got to be very careful in drawing conclusions from the mathematical quantities you're measuring. You can get hold of lots of stock market data, for example monthly "TAQ" data for the New York Stock Exchange, which lets you analyze NASDAQ and N.Y. stock prices. It's fine to measure the 5th order cumulant of price differences, but so what? You're interested in profit, which doesn't involve only the prices you get from the data. It involves quantities not available in the data at all: will the market maker fill your order?

There are ways around this problem. One is to trade only with ECNs, which tend to be much more straightforward about filling your orders. But the main problem, again, is one of competition. You're competing with zillions of professionals who do the same thing. And they often make their decisions electronically, with very fast communication links, meaning they'll be able to beat you to the punch every time you think you've found a statistically profitable trade.

All this applies to very short times, of order milliseconds and seconds. But a lot of people like to invest on longer time scales: days, weeks, months, and years. Since the correlation of the stock price at present with the price in the future decays quickly with time, over order minutes, why should anyone think there is anything left after a week? Well, there probably isn't if you look only at the stock price. You have to start thinking about other things, like the fundamentals of the company, news, and so on. Lots of people don't do this, and think they make big bucks on these longer time scales. Often they lack a basic understanding of statistics and probability, and would do well to read this site carefully. Is their profit statistically significant? Have they even tried to do the calculation? Almost always, they're not interested in answering this question scientifically. Perhaps they secretly believe that they have some unique talent that makes their investment skills superior to the army of Wall St. quants looking at the same problem.
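One simple version of "the calculation" is to compare the average return with its standard error. The sketch below assumes the returns are independent and roughly normally distributed, which is itself a strong assumption for market data; the function name and the sample numbers are made up for illustration.

```python
import math


def profit_t_statistic(returns):
    """Return (mean, standard error, t) for a list of per-period returns.

    t is the mean divided by its standard error. Under the stated
    assumptions, |t| well below about 2 means the profit is not
    distinguishable from luck at roughly the 95% level.
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
    stderr = math.sqrt(var / n)
    if stderr == 0.0:
        raise ValueError("returns have zero variance")
    return mean, stderr, mean / stderr


# Hypothetical returns over six periods: positive on average, but noisy.
example = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01]
mean, stderr, t = profit_t_statistic(example)
print(f"mean={mean:.4f}  stderr={stderr:.4f}  t={t:.2f}")
```

For these made-up numbers the average return is positive, yet the t value is well under 2, so the apparent profit could easily be chance.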

josh 2010-10-20