


Assessing Normality
Posted: Oct 3, 1996 12:39 PM


Forwarded message from Josh Tabor:

Although assessing normality (normal quantile plots) is not listed as a topic for the AP exam, many books, including mine (IPS), cover it. Is it really necessary? Looking at the histogram seems to be good enough, although maybe I'm being a bit naive. If it should be done, are there any good data sets that give a clearly nonlinear plot? I tried rather unsuccessfully to make one up myself. The outliers, no matter how far away from the rest of the data, still seemed to be on the line. It left me rather frustrated and ready to throw out the whole topic.

End of forwarded message from Josh Tabor

Let's try a Cauchy distribution. I'll insert comments in CAPS.
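
For readers without Minitab, here is a minimal Python sketch of the same idea. A t distribution with 1 degree of freedom is the standard Cauchy distribution, so a sample can be drawn by pushing uniform random numbers through the Cauchy quantile function tan(pi(p - 1/2)). The function names and the seed are my own choices for illustration, not anything from the session below, and the values will not match the Minitab output.

```python
import math
import random


def cauchy_quantile(p):
    """Quantile (inverse CDF) of the standard Cauchy, i.e. t with 1 df."""
    return math.tan(math.pi * (p - 0.5))


def cauchy_sample(n, seed=None):
    """Draw n standard Cauchy values by inverse-CDF sampling."""
    rng = random.Random(seed)  # a private generator keeps the draw reproducible
    return [cauchy_quantile(rng.random()) for _ in range(n)]


if __name__ == "__main__":
    sample = cauchy_sample(50, seed=1996)  # the seed is arbitrary
    print(sorted(sample))
```

Sorting the output makes the heavy tails easy to spot: most values sit near zero, while a few are typically far out on either side.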
*******************************************************************
MTB > randomly generate 50 observations in c1;
SUBC> t dist with 1 df.
MTB > print c1

C1
   1.6483   2.7595   1.1327   2.5016   2.3328   7.7386   0.2006
   0.7827  11.8376   0.6020   0.1113   1.2924   0.6804   3.6877
   0.5656   0.9823   2.4857   0.6551  68.3163   0.6626   1.6935
  13.8105   7.3409   0.4212   0.5819   0.0112   0.9809   0.6411
   0.0293   0.5744   0.0272   0.9101   5.8839   0.4478   0.1435
   2.7998   2.0776   0.0468   0.7280   1.4928   0.3862   2.0653
   4.0453   3.2031   0.5641   3.4721   3.6683   1.9037   1.1137
  38.4306

[The minus signs in this listing were lost in the archived copy; the stem-and-leaf display below shows that 25 of the 50 values are negative.]

MTB > stem c1

Stem-and-leaf of C1   N = 50
Leaf Unit = 1.0

    1   -6 8
    1   -5
    1   -4
    1   -3
    1   -2
    1   -1
   25   -0 422222111100000000000000
   25    0 0000000000111223333577
    3    1 13
    1    2
    1    3 8

WE SEEM TO HAVE SOME OUTLIERS.

MTB > nscores c1 in c2
MTB > plot c1 c2

[Plot: C1 (vertical axis, ticks at 35, 0, -35, -70) against normal scores C2 (horizontal axis, ticks at -1.60, -0.80, 0.00, 0.80, 1.60)]

THE NORMAL PLOT SHOWS THE OUTLIERS BUT IT ALSO SHOWS SOME CURVATURE WHICH IS NOT DUE TO OUTLIERS. LET'S REMOVE THE BIGGEST OUTLIERS AND SEE WHAT HAPPENS.

MTB > copy c1 into c4;
SUBC> omit row 19.
MTB > stem c4

Stem-and-leaf of C4   N = 49
Leaf Unit = 1.0

   24   -0 422222111100000000000000
  (19)   0 0000000000111223333
    6    0 577
    3    1 13
    1    1
    1    2
    1    2
    1    3
    1    3 8

MTB > copy c4 into c5;
SUBC> omit row 49.
MTB > stem c5

Stem-and-leaf of C5   N = 48
Leaf Unit = 1.0

    1   -0 4
    6   -0 22222
   24   -0 111100000000000000
   24    0 0000000000111
   11    0 223333
    5    0 5
    4    0 77
    2    0
    2    1 1
    1    1 3

MTB > nscores c5 in c6
MTB > plot c5 c6

[Plot: C5 (vertical axis, ticks at 0.0, 6.0, 12.0) against normal scores C6 (horizontal axis, ticks at -1.60, -0.80, 0.00, 0.80, 1.60)]

THE STEM AND LEAF SUGGESTS AN OUTLIER PROBLEM. THE NORMAL PLOT INDICATES THAT THE "OUTLIERS" ARE ACTUALLY PART OF THE OVERALL PATTERN HERE, WHICH IS A CURVE WITH A POSITIVE SECOND DERIVATIVE.

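
The normal scores and the "is it a line?" judgment used above can be sketched in Python as well. This is only an illustration under stated assumptions: the scores use Blom's approximation, (i - 3/8)/(n + 1/4), which I believe is what Minitab's NSCORES computes but have not verified here; ties are not averaged; and the function names are hypothetical. The correlation between the sorted data and their normal scores gives a rough numerical measure of how straight the normal plot is, though it is no substitute for looking at the plot.

```python
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected normal order statistics for a sample of size n.

    Uses Blom's plotting positions (i - 3/8) / (n + 1/4); this is an
    assumption about how the scores are defined, not a quote of Minitab.
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def plot_correlation(data):
    """Pearson correlation between the sorted data and their normal scores."""
    ys = sorted(data)
    zs = normal_scores(len(ys))
    y_bar = sum(ys) / len(ys)
    z_bar = sum(zs) / len(zs)
    s_yz = sum((y - y_bar) * (z - z_bar) for y, z in zip(ys, zs))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    return s_yz / (s_yy * s_zz) ** 0.5


if __name__ == "__main__":
    made_up = [1, 2, 3, 4, 100]  # invented numbers, not the data from this post
    print(plot_correlation(made_up))
```

A value close to 1 means the points fall nearly on a line. A single wild value pulls the correlation down, and a curved pattern does so even after the extreme points are removed.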
Robert W. Hayden
Department of Mathematics
Plymouth State College
Plymouth, New Hampshire 03264 USA

Rural Route 1, Box 10
Ashland, NH 03217-9702
(603) 968-9914 (home)
hayden@oz.plymouth.edu
fax (603) 535-2943 (work)



