". . . intending to show that all decisions in Statistical Inference are subject to chance, always carrying a risk (probability) of being wrong, because one cannot extract certitude from uncertainty . . ." (Probabilidades e Estatística, Conceitos, Métodos e Aplicações, McGraw Hill, 1990)
A number of critical remarks, and misinterpretations as well, have been reported regarding hypothesis tests, most of them unreasonable, as we shall see.
What is the aim of the classical hypothesis-testing procedure?
Based on how extreme a value the test statistic can be expected to take when the null hypothesis is true, one can set a criterion for deciding whether to reject the null or to keep it.
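As a minimal sketch of this criterion, consider a two-tailed test whose statistic Z is approximately standard normal under the null. The significance level alpha fixes a critical value, and the decision rule is simply a comparison against it. The numbers below are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Under H0 the statistic Z ~ N(0, 1); a two-tailed test at level alpha
# rejects H0 when |z| exceeds the critical value z_crit.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05

def decide(z_observed: float, z_critical: float) -> str:
    """Reject H0 when the observed statistic is more extreme than the criterion."""
    return "reject H0" if abs(z_observed) > z_critical else "keep H0"

print(round(z_crit, 2))     # 1.96
print(decide(2.5, z_crit))  # reject H0
print(decide(1.0, z_crit))  # keep H0
```

Note that the rule never declares the null true; it only says whether the data are extreme enough, at level alpha, to deny it.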
Naively trusting in forthcoming generations' sagacity, the founders of significance tests surely never thought that someone could read H0: p1 = p2 (two-tailed test) literally, as a search for the exact equality of the two parameters. Everyone knows it is clearly impossible to reach such a result. In fact, the most one can achieve is to conclude that we did not find sufficient evidence to deny the null. Otherwise the alternative hypothesis must be preferred: p1 and p2 appear to be so different that H0 is unlikely given the data, as gauged by the significance level alpha.

What is important to stress, therefore, is that the absolute true/false paradigm must be replaced, everywhere and forever, by likely/unlikely. Karl Popper's idea that every assertion must be true or false is of no use here: this is not the universal "all swans are white", promptly denied when a black swan is found (the falsifiability, or refutability, principle). Here the swans are grey, in different intensities. What we must ask of the data is to what degree it is likely that we can reject the null hypothesis and accept the alternative. Are we risking being mistaken? Of course! We even choose alpha, the probability of rejecting the null when it is TRUE!

Besides, there is a point that should be tackled carefully because of its fragility: the p-value, the probability of observing a value at least as extreme as the one obtained in case the null hypothesis is true, and the criterion p < alpha to reject H0. Nobody doubts that it is quite risky to estimate p from a single observed value of the test statistic. However, we surely cannot go further than to adopt this rather artificial convention, excluding perhaps a repetition of the test with more data. Although this is a real weakness and cannot be overlooked, to claim that it is deadly wrong seems somewhat exaggerated.