


Re: The power of a test
Posted: Jun 8, 2013 11:26 PM


On Sun, 09 Jun 2013 01:10:50 +0200, Cristiano <cristiapi@NSgmail.com> wrote:
>On 08/06/2013 1:56, Rich Ulrich wrote:
>> The text says that this is the empirical result of 1000
>> random samples. Apparently, there were 38 "rejections"
>> rather than 50, in this particular Monte Carlo experiment.
>>
>> The 95% CI for 50 (as a Poisson, small count) is about (37, 65)
>> so this result is not too unusual.
>
>Can we suppose that they used an approximate critical value?
No. Why would they do that? JB is old enough that I expect that the cutoffs they used are precise.
>I think that, because they say: "In this way, the first column depicts
>the *actual size of the tests*, while the other columns represent their
>power.", which could be interpreted as: "using a good estimation of the
>critical value for JB, we get the actual size of the test with G_0,0.0".
>Does it make any sense?
No.
They do a Monte Carlo randomization of 1000 trials under the null condition; they use cutoffs for a test; they report the observed "size" for that test. Since they were looking at a 5% tail, they expected 50 rejections, and this set of 1000 trials happened to give them 38. The other tests were computed on the same set of 1000 and gave the same count of rejections, or were off by 1. They were all essentially the same.
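To put a number on "happened to give them 38", here is a minimal sketch of the arithmetic. It assumes that each of the 1000 null trials rejects independently with probability 0.05, so the count of rejections is Binomial(1000, 0.05). The helper name binom_cdf is mine, not anything from the paper.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Assumption: 1000 independent trials, each rejecting with probability 0.05.
n_trials, alpha, observed = 1000, 0.05, 38

expected = n_trials * alpha
lower_tail = binom_cdf(observed, n_trials, alpha)

print(f"expected rejections: {expected:.0f}")
print(f"P(count <= {observed}) = {lower_tail:.3f}")
```

A lower-tail probability of a few percent means 38 is on the low side, but it is the kind of count that a correct test will produce now and then.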
You need to study, read papers, contemplate, what have you, to get used to this terminology and set of expectations. For instance, it is technically a BAD performance for a RNG if it does not produce "random variation" in every criterion that you measure it against. And, about 5 times out of 100, a set of 1000 trials will result in fewer or more rejections than the range specified by the CI, if you have a decent RNG.
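If you want to see that variation for yourself, a minimal sketch follows. It repeats the "1000 trials at a 5% tail" experiment many times and counts how often the number of rejections lands outside the (37, 65) range quoted earlier. The seed, the number of repetitions, and the function name are my own choices, and Python's built-in generator stands in for "a decent RNG".

```python
import random

def rejection_count(n_trials, alpha, rng):
    """Number of rejections in n_trials independent tests of a true null."""
    return sum(rng.random() < alpha for _ in range(n_trials))

rng = random.Random(2013)      # arbitrary seed, for reproducibility only
lo, hi = 37, 65                # the approximate 95% range quoted for a count of 50
n_experiments = 2000           # hypothetical number of repeated experiments

counts = [rejection_count(1000, 0.05, rng) for _ in range(n_experiments)]
outside = sum(c < lo or c > hi for c in counts)
frac_outside = outside / n_experiments

print(f"smallest count: {min(counts)}, largest count: {max(counts)}")
print(f"fraction of experiments outside ({lo}, {hi}): {frac_outside:.3f}")
```

The fraction outside should come out at a few percent. It need not be exactly 5%, since (37, 65) is an approximate Poisson interval rather than an exact binomial one.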
The later columns in the table are called "power" because (a) every sample is nonnormal, by design, and (b) they represent how often that nonnormality is detected.
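A minimal sketch of the size/power distinction, under assumptions of my own: the JB statistic is compared with the asymptotic chi-square(2) cutoff (not the precise finite-sample cutoffs I expect the authors used), and the "contamination" is a made-up mixture in which 5% of the points come from a normal with SD 5. Neither the mixture nor the sample size of 100 is taken from the paper.

```python
import math
import random

def jarque_bera(x):
    """JB statistic: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

def sample(n, contam_frac, contam_sd, rng):
    """N(0,1) data; each point is replaced by N(0, contam_sd^2) with prob contam_frac."""
    return [rng.gauss(0.0, contam_sd if rng.random() < contam_frac else 1.0)
            for _ in range(n)]

def rejection_rate(n_trials, n, contam_frac, contam_sd, cutoff, rng):
    """Fraction of n_trials samples whose JB statistic exceeds the cutoff."""
    rejections = sum(
        jarque_bera(sample(n, contam_frac, contam_sd, rng)) > cutoff
        for _ in range(n_trials)
    )
    return rejections / n_trials

rng = random.Random(1)              # arbitrary seed
cutoff = -2.0 * math.log(0.05)      # asymptotic chi-square(2) upper 5% point (assumption)

size = rejection_rate(1000, 100, 0.0, 1.0, cutoff, rng)     # null: all samples normal
power = rejection_rate(1000, 100, 0.05, 5.0, cutoff, rng)   # every sample contaminated

print(f"cutoff = {cutoff:.3f}")
print(f"observed size  = {size:.3f}")
print(f"observed power = {power:.3f}")
```

The first rate is the "size" column: how often normal samples are rejected. The second is a "power" column: how often samples that are nonnormal by construction are detected.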
I hope that you recognize that this is a peculiar paper, in a way. Most of the time, authors are proposing ways to *detect* differences. These authors are proposing "robust measures" that will NOT report samples as nonnormal when they are "merely" contaminated by outliers of one sort or another. Thus, they are happy and proud to point to the lack of power for detecting the specified sorts of contamination, for certain "robust" tests.
If you want a test to detect nonnormality in the form of those contaminations ... the JB does fine. The power of the so-called robust tests is going to be concentrated elsewhere. Presumably.
I'm not sure I see much value in their paper. If the JB rejects severely, and their robust tests do not, you might conclude, "Well, the sample would be pretty normal if it were not for 1%/5% contamination."
But then... I've never, ever, paid much attention to any formal tests of normality. Maybe it is a lot more useful to someone who has had data where other tests of normality were useful.
--
Rich Ulrich



