Date: Oct 10, 2013 4:38 PM
Author: Paul
Subject: small confidence limit to demonstrate inability to reject H0?

I've had my head in intro stats and simple hypothesis testing for about a year. It all seems to make sense, especially if you want to demonstrate an effect for a drug and H0 corresponds to no effect. For example, if you set 95% confidence limits, then the rejection region under H0's PDF covers 5% of the probability. If your sample statistic lies outside the confidence limits, then under the scenario that H0 is true, there is only a 5% chance of getting a result that extreme. The closer you set your confidence level to 100%, the less likely it is that a result outside the limits is due to chance under the H0 scenario. So you're more justified in rejecting H0.
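
To check my understanding of the 5% figure, here is a minimal simulation sketch. It assumes a two-sided z-test on the mean of n = 30 draws from a standard normal with known sigma = 1; the function name `rejection_rate` and all of the parameter values are my own illustrative choices. It counts how often a sample generated under H0 lands outside the 95% limits.

```python
import random
from statistics import NormalDist

def rejection_rate(level, n=30, trials=20000, seed=0):
    """Fraction of H0-true samples whose standardized mean falls outside the limits."""
    rng = random.Random(seed)
    # Two-sided critical value: the central `level` mass lies within +/- z_crit.
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: data really are N(0, 1), and sigma = 1 is assumed known.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate_95 = rejection_rate(0.95)
print(f"rejection rate at 95% limits: {rate_95:.3f}")
```

The printed rate should come out close to 0.05, which is the "5% chance under H0" described above.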

What I don't get is the common practice of choosing high confidence levels when you're interested in showing that rejection of H0 is *not* justified. For example, at http://www.r-bloggers.com/the-many-uses-of-q-q-plots, 95% confidence bands are chosen for a Q-Q plot, with H0 being that the data are normally distributed. To highlight why I'm confused by this, let's consider the extreme case of 99.99% confidence limits. That means the rejection region is only 0.01%, and the bands widen out. Almost any scattering of data will fit within the confidence limits. So if all my data points are within the confidence limits, it says very little. In contrast, if I chose 50% confidence limits, having all my data points within those limits is a harder test to meet. I know that hypothesis testing does not allow one to determine whether H0 is true, but having all data points within 50% confidence limits allows one to definitively rule out any evidence against H0, which is a lot more info than what can be gleaned from 99.99% confidence limits. From browsing the web, I gather that weak evidence against H0 starts to build up once p-values drop to about 0.10.
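
To put numbers on how the limits widen, here is a small sketch. It assumes a single standard normal test statistic with two-sided limits, which is a simplification and not the Q-Q band construction used in the linked post; the helper name `two_sided_half_width` is my own.

```python
from statistics import NormalDist

def two_sided_half_width(level):
    """Half-width z such that P(|Z| <= z) = level for a standard normal Z."""
    return NormalDist().inv_cdf(0.5 + level / 2)

levels = [0.50, 0.95, 0.9999]
half_widths = {level: two_sided_half_width(level) for level in levels}

for level, z in half_widths.items():
    # The rejection region is whatever probability is left outside the limits.
    print(f"{level:.2%} limits: |z| <= {z:.3f}, rejection region = {1 - level:.2%}")
```

The half-widths come out at roughly 0.67, 1.96, and 3.89, so the 99.99% limits are nearly six times as wide as the 50% limits.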

So if one wants to demonstrate a lack of evidence against H0, why wouldn't one choose as low a confidence level as possible? I described my confusion in the context of Q-Q plots, but the same conceptual question dogs me for simple scalar hypothesis testing.