The Math Forum

Topic: Bootstrap and NHST
Luis A. Afonso

Posts: 4,758
From: Lisbon (Portugal)
Registered: 2/16/05
Posted: May 12, 2013 10:38 AM

Because fast data processing by computer is now feasible, computation-intensive methods based on random simulation, such as Permutation, Bootstrap, Jackknife, etc., have acquired increasing importance compared with parametric procedures. The slow, limited mechanical machines used in Fisher's time, one hundred years ago, bear no comparison with modern computers, even PCs. Furthermore, the lack of a theoretical probabilistic basis for many procedures leads people to base their treatments on numerical procedures, sometimes only naively intuitive, which computers are so well able to carry out.

__0__Directional aiming

I intend to analyse the work of Bradley Efron and his co-authors, mainly the papers from the 1980s, in order to get an idea of how the classical methodology compares with this new standard, fully adopted since then.

__1__One-mean Bootstrap
A common, simple Decision problem occurs when a Distribution mean is bounded by a two-tailed interval with probability 1-alpha, using a normal sample of size n, namely:

xhat +/- T(n-1, alpha/2)*sqrt(ssd/(n*(n-1))) (1)
xhat = observed mean,
ssd = sum of squared deviations about xhat,
T(n-1, alpha/2) = Student Distribution fractile alpha/2, n-1 df.
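
As a minimal sketch of how interval (1) is computed, the Python function below takes the Student fractile as an argument instead of computing it. The standard error is taken as sqrt(ssd/(n*(n-1))), as in the ORCHID program at the end of this post. The function name `t_interval` and the toy data are assumptions made for illustration only.

```python
import math

def t_interval(sample, t_fractile):
    """Two-tailed interval (1) for the mean; t_fractile plays the role of T(n-1, alpha/2)."""
    n = len(sample)
    xhat = sum(sample) / n                              # observed mean
    ssd = sum((x - xhat) ** 2 for x in sample)          # sum of squared deviations about xhat
    half_width = t_fractile * math.sqrt(ssd / (n * (n - 1)))
    return xhat - half_width, xhat + half_width

# Toy data with n = 5; 2.776 is the tabulated Student fractile for 4 df, alpha/2 = 0.025.
left, right = t_interval([1.0, 2.0, 3.0, 4.0, 5.0], 2.776)
print(left, right)
```

Passing the fractile in keeps the sketch free of any statistics library, in the same spirit as the hard-coded T() table of the program.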

It is worth noting that, once the size is chosen, we are dealing with two sample statistics, xhat and ssd, with different distributions, normal and chi-sq. More precisely, the Distribution variance sigmasq is bounded by:

ssd/Chi0 <= sigmasq <= ssd/Chi1 (2)
Chi0= Chi(n-1, alpha/2)
Chi1= Chi(n-1, 1-alpha/2),
while xhat follows a Normal Distribution.
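
A short derivation of (2) may help here. It is only a sketch and it assumes that Chi0 and Chi1 are read as upper-tail fractiles, which is the reading consistent with the direction of the inequalities; for a normal sample, ssd/sigmasq follows a chi-sq distribution with n-1 df.

```latex
\[
\frac{\mathrm{ssd}}{\sigma^2} \sim \chi^2_{n-1}
\;\Longrightarrow\;
P\!\left(\chi_1 \le \frac{\mathrm{ssd}}{\sigma^2} \le \chi_0\right) = 1-\alpha
\;\Longleftrightarrow\;
P\!\left(\frac{\mathrm{ssd}}{\chi_0} \le \sigma^2 \le \frac{\mathrm{ssd}}{\chi_1}\right) = 1-\alpha ,
\]
\[
\chi_0 = \chi^2_{n-1}(\alpha/2), \qquad
\chi_1 = \chi^2_{n-1}(1-\alpha/2)
\quad \text{(upper-tail points, so } \chi_1 < \chi_0 \text{)}.
\]
```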

In Parametric Statistics a two-tail *symmetric* 1-alpha C.I. is such that there is only probability alpha/2 that the parameter's value lies below the left bound and alpha/2 that it lies above the right bound.
Given an i.i.d. sample of size n, the source S, a bootstrap sample is obtained by sampling n items from S at random, with replacement. Therefore a Bsample can show the same source item repeated anywhere from 1 to n times. It is claimed, and certain theoretical results tend to confirm, that asymptotically the full Bset can be regarded as equal to the Population of samples obtained from the full Distribution when drawn one by one.
Among the multitude of people whose professional work is Data Processing there is at least one, me, who truly hates the term asymptotic and avoids, if possible, using such techniques. They are irreplaceable for Deductive Purposes but ominous for practical finite data.
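
The resampling step described above can be sketched in a few lines of Python. The function name `bootstrap_sample`, the toy source of 10 integers and the fixed seed are assumptions made for illustration; they are not taken from the ORCHID program.

```python
import random

def bootstrap_sample(source, rng):
    """Draw len(source) items from source at random, with replacement."""
    n = len(source)
    return [source[rng.randrange(n)] for _ in range(n)]

rng = random.Random(2013)        # fixed seed, only to make the run repeatable
source = list(range(1, 11))      # toy source S with n = 10 distinct items
bsample = bootstrap_sample(source, rng)
distinct = len(set(bsample))     # a Bsample holds between 1 and n distinct source items
print(bsample, distinct)
```

Because the draws are made with replacement, some source items usually appear more than once in the Bsample while others are absent.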

The first conclusion about the dispersion is that (1) contains from 0.954 to 0.951 of the values of the Bootstrap set when the N(0,1) source samples go from n=100 to 400. The results also give the frequencies falling beyond the left bound and beyond the right one. For each size, 5 sources were obtained and 400,000 Bootstrap samples synthesized.
(Program <orchid>)
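
For readers without a BASIC interpreter, the experiment can be sketched in Python as follows. This is an illustrative port under stated assumptions, not the program used for the figures quoted above: the function name `bootstrap_coverage`, the seed and the small number of replicates (2,000 instead of 400,000) are choices made here, and the source is drawn with `random.gauss` instead of the Box-Muller formula.

```python
import math
import random

def bootstrap_coverage(n, t_fractile, replicates, rng):
    """Fractions of bootstrap means below, inside and above interval (1)."""
    source = [rng.gauss(0.0, 1.0) for _ in range(n)]        # N(0,1) source sample
    xhat = sum(source) / n
    ssd = sum((x - xhat) ** 2 for x in source)
    st = math.sqrt(ssd / (n * (n - 1)))                     # standard error of the mean
    left, right = xhat - t_fractile * st, xhat + t_fractile * st
    below = inside = above = 0
    for _ in range(replicates):
        # mean of one bootstrap sample: n draws with replacement from the source
        m = sum(source[rng.randrange(n)] for _ in range(n)) / n
        if m < left:
            below += 1
        elif m > right:
            above += 1
        else:
            inside += 1                                     # ties with a bound count as inside
    return below / replicates, inside / replicates, above / replicates

rng = random.Random(1)                                      # fixed seed for repeatability
below, inside, above = bootstrap_coverage(100, 1.9842, 2000, rng)
print(below, inside, above)
```

The fractile 1.9842 is the n=100 entry of the T() table in the program; the other sizes would use the remaining entries.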








The problem will not be sufficiently treated until we compare the Dispersion of the Bootstrap mean values with that provided by Parametric methods.

Luis A. Afonso

DEFDBL A-Z: PRINT " ORCHID n= 100(50)400 "
DIM X(400), T(7)
REM Student fractiles, alpha/2 = 0.025, n-1 df, for n = 100(50)400
T(1) = 1.9842: T(2) = 1.976: T(3) = 1.972: T(4) = 1.9695
T(5) = 1.9679: T(6) = 1.9668: T(7) = 1.9659
REM seed the generator so that each run draws a new source
RANDOMIZE TIMER
5 INPUT " size = "; n
REM kii indexes T(): n = 100 -> 1, 150 -> 2, ..., 400 -> 7
kii = 2 * n / 100 - 1
IF INT(kii) <> kii THEN GOTO 5
IF kii > 7 THEN GOTO 5
IF kii < 1 THEN GOTO 5
INPUT " how many= "; all
pi = 4 * ATN(1)
REM draw the N(0,1) source sample (Box-Muller)
msource = 0: sssource = 0
FOR i = 1 TO n
REM 1 - RND avoids LOG(0)
aa = SQR(-2 * LOG(1 - RND))
X(i) = 0 + 1 * aa * COS(2 * pi * RND)
msource = msource + X(i) / n
sssource = sssource + X(i) * X(i)
NEXT i
REM bounds of interval (1)
ssd = sssource - n * msource * msource
st = SQR(ssd / (n * (n - 1)))
left = msource - st * T(kii)
right = msource + st * T(kii)
REM bootstrap: n draws with replacement, repeated all times
FOR rpt = 1 TO all
m = 0
FOR i = 1 TO n
gg = INT(n * RND) + 1
m = m + X(gg) / n
NEXT i
REM checking the Bmean
IF m < left THEN lft = lft + 1
IF m > right THEN rgt = rgt + 1
IF m > left AND m < right THEN u = u + 1
LOCATE 12, 43
PRINT USING " ######### "; all - rpt
LOCATE 14, 40
PRINT USING "#.#### "; lft / rpt; u / rpt; rgt / rpt
NEXT rpt
PRINT USING "##.#### "; T(kii)
END
