


normal probability plots
Posted: Mar 4, 1997 8:07 PM


Rex Boggs said:
> I must confess that, while I have some understanding about how to
> interpret a normal probability plot, I have absolutely no idea how to
> construct one for a particular data set. As a teacher who may have used
> them to justify using the t-test, this makes me very uncomfortable,
> especially when being asked how this plot was constructed and having to
> profess ignorance.
>
> Is it possible to explain how to do this in an email? Or is there a
> website that I can visit?
Here's a simple data set that my notes say came from the Minitab Reference Manual, release 10.5. (I don't have the manual here.)
X: .1, .9, 1.1, 1.8, 2.3
Refer to these data points, in order, as x_i, with i = 1,2,3,4,5.
Now sketch a picture of the normal curve. If the X values are normally distributed, you might expect them to occur at, say, the 10th, 30th, 50th, 70th, and 90th percentiles (i.e., at the z-values with cum probs of .1, .3, .5, .7, and .9). Use a normal table or the TI-83 to look up the z-values for these percentiles. I get about -1.28, -.52, 0, .52, and 1.28. Pair each data value with its z-value, smallest with smallest, and construct, by hand, an ordinary "xy plot" of the five points (X,z). That's a normal probability plot.
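For anyone without a normal table or a TI-83 at hand, the lookup step can be sketched in Python. This is only an illustration of the hand procedure described above, not what Minitab does internally; it assumes Python 3.8 or later for `statistics.NormalDist`, and the names `cum_probs` and `points` are mine.

```python
from statistics import NormalDist

# The five data values from the example, already in increasing order.
x_values = [0.1, 0.9, 1.1, 1.8, 2.3]

# Cumulative probabilities assumed for the ordered data
# (10th, 30th, 50th, 70th, and 90th percentiles).
cum_probs = [0.1, 0.3, 0.5, 0.7, 0.9]

# Look up the z-value for each cumulative probability
# using the inverse CDF of the standard normal distribution.
std_normal = NormalDist()  # mean 0, standard deviation 1
z_values = [std_normal.inv_cdf(p) for p in cum_probs]

# The points (X, z) to plot by hand.
points = list(zip(x_values, z_values))
for x, z in points:
    print(f"{x:4.1f}  {z:6.2f}")
```

Rounded to two places, the z-values printed should agree with the ones looked up by hand.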
Now compare the hand-drawn result with the following Minitab plot; they should be pretty similar.
MTB > set c1
DATA> .1 .9 1.1 1.8 2.3
DATA> end
MTB > nscore c1 c2
MTB > name c1 'X' c2 'Nscore'
MTB > print c1-c2
< data display temporarily omitted; see below >
MTB > GStd.
MTB > Plot 'Nscore' 'X';
SUBC> Symbol 'x'.
Character Plot
[Character plot: Nscore on the vertical axis (ticks at 0.80, 0.00, -0.80) against X on the horizontal axis (ticks at 0.40, 0.80, 1.20, 1.60, 2.00), with the five points marked 'x'.]
MTB > GPro.
MTB > nooutfile
One last matter... For a data set of size n = 5, as given above, think about a formula that produces the percentiles in this example:
i:   1   2   3   4   5
j:  .1  .3  .5  .7  .9
A little thought shows that j = (i - .5)/n. Various groups have chosen slightly different formulas for j, yielding slightly different normal scores for the plot. According to my notes, Minitab uses (i - 3/8)/(n + 1/4) and Data Desk uses (i - 1/3)/(n + 1/3). Each of these choices yields slightly different cum probs; for example, here are the Nscores generated by Minitab (which I moved from the plot above):
Data Display
Row    X     Nscore
  1   0.1   -1.17877
  2   0.9   -0.49532
  3   1.1    0.00000
  4   1.8    0.49532
  5   2.3    1.17877
To summarize,
my example above:  -1.28  -0.52  0.00  0.52  1.28
Minitab:           -1.18  -0.50  0.00  0.50  1.18
Data Desk:         -1.15  -0.49  0.00  0.49  1.15
But I think you will see that each of these three choices gives essentially the same plot.
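If you want to check the comparison yourself, here is a rough Python sketch that evaluates all three formulas for n = 5. Each formula has the form (i - a)/(n + b); the helper name `plotting_scores` and the dictionary labels are my own, and the Minitab and Data Desk constants are simply the ones quoted from my notes above, not taken from either package's documentation. It assumes Python 3.8 or later.

```python
from statistics import NormalDist

def plotting_scores(n, a, b):
    """Return z-values at cumulative probabilities (i - a)/(n + b), i = 1..n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - a) / (n + b)) for i in range(1, n + 1)]

n = 5

# (a, b) for each convention; the last two are as quoted in the notes above.
conventions = {
    "my example": (0.5, 0.0),     # (i - .5)/n
    "Minitab": (3 / 8, 1 / 4),    # (i - 3/8)/(n + 1/4)
    "Data Desk": (1 / 3, 1 / 3),  # (i - 1/3)/(n + 1/3)
}

scores = {name: plotting_scores(n, a, b) for name, (a, b) in conventions.items()}

for name, zs in scores.items():
    print(f"{name:12s}", " ".join(f"{z:6.2f}" for z in zs))
```

The rows printed should match the summary above to two decimal places, which is another way of seeing how little the choice of formula matters for a plot.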
Hope I've got that right, and that it helps.
==============================================
Bruce King
Department of Mathematics and Computer Science
Western Connecticut State University
181 White Street
Danbury, CT 06810
(kingb@wcsu.ctstateu.edu)



