Saturday, December 28, 2013

Statistical Significance - Again

With all of this emphasis on "Big Data", I was pleased to see this post on the Big Data Econometrics blog today.

When you have a sample that runs to the thousands (billions?), the conventional significance levels of 10%, 5%, 1% are completely inappropriate. You need to be thinking in terms of tiny significance levels.
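To see why, here is a minimal sketch (my own illustration, not taken from either post). It assumes a simple two-sided z-test of a zero mean with a known standard deviation of 1, and a hypothetical, practically negligible effect of 0.01 standard deviations. The function name and the sample sizes are made up for the example.

```python
import math


def two_sided_p_value(effect, n):
    """Two-sided p-value for a z-test of H0: mean = 0.

    Assumes the sample mean equals `effect` and the population
    standard deviation is known to be 1 (illustrative assumptions).
    """
    z = abs(effect) * math.sqrt(n)       # test statistic grows like sqrt(n)
    return math.erfc(z / math.sqrt(2))   # equals 2 * (1 - Phi(z))


EFFECT = 0.01  # hypothetical effect size, in standard-deviation units

for n in (1_000, 100_000, 1_000_000, 10_000_000):
    p = two_sided_p_value(EFFECT, n)
    print(f"n = {n:>10,}  z = {EFFECT * math.sqrt(n):6.2f}  p = {p:.3g}")
```

The effect is held fixed while n grows, so the only thing driving the p-value towards zero is the sample size. With a large enough sample, almost any departure from the null, however trivial, will clear the 5% or 1% bar.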

I discussed this in some detail back in April of 2011, in a post titled, "Drawing Inferences From Very Large Data-Sets". If you're one of those (many) applied researchers who use large cross-sections of data, and then sprinkle the results tables with asterisks to signal "significance" at the 5%, 10% levels, etc., then I urge you to read that earlier post.

It's sad to encounter so many papers and seminar presentations in which the results, in reality, are totally insignificant!

© 2013, David E. Giles