Topic: Assessing Normality

Bob Hayden
Posted: Oct 3, 1996 12:39 PM

----- Forwarded message from Josh Tabor -----
Although assessing normality (normal quantile plots) is not listed as a
topic for the AP exam, many books, including mine (IPS), cover it. Is
it really necessary? Looking at the histogram seems to be good enough,
although maybe I'm being a bit naive. If it should be done, are there
any good data sets which give a clearly non-linear plot? I tried
rather unsuccessfully to make one up myself. The outliers, no matter
how far away from the rest of the data, still seemed to be on the line.
It left me rather frustrated and ready to throw out the whole topic.

----- End of forwarded message from Josh Tabor -----

Let's try a Cauchy distribution (Student's t with 1 degree of freedom). I'll insert comments in CAPS.

*******************************************************************

MTB > randomly generate 50 observations in c1;
SUBC> t dist with 1 df.
MTB > print c1

C1
   -1.6483   -2.7595   -1.1327   -2.5016   -2.3328    7.7386   -0.2006
   -0.7827   11.8376   -0.6020   -0.1113    1.2924   -0.6804    3.6877
    0.5656    0.9823    2.4857    0.6551  -68.3163    0.6626    1.6935
   13.8105    7.3409   -0.4212    0.5819    0.0112   -0.9809   -0.6411
   -0.0293    0.5744   -0.0272   -0.9101    5.8839    0.4478   -0.1435
    2.7998   -2.0776    0.0468    0.7280   -1.4928   -0.3862   -2.0653
   -4.0453    3.2031   -0.5641    3.4721    3.6683   -1.9037    1.1137
   38.4306
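
For readers without Minitab, here is a minimal Python sketch of the same step. It relies on the fact that a t distribution with 1 df is the standard Cauchy, and it assumes the inverse-CDF method for generating the values. The function name `cauchy_sample` and the seed are illustrative only, and the output will not reproduce the numbers printed above.

```python
import math
import random


def cauchy_sample(n, seed=None):
    """Draw n values from a standard Cauchy (Student's t with 1 df)."""
    rng = random.Random(seed)
    # Inverse-CDF method: if U is uniform on [0, 1), then
    # tan(pi * (U - 0.5)) follows a standard Cauchy distribution.
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


# Hypothetical seed, chosen only so the run is repeatable.
sample = cauchy_sample(50, seed=1996)
print(" ".join(f"{x:.4f}" for x in sorted(sample)))
```

Sorting before printing makes the heavy tails easy to spot: a few values typically sit far from the bulk of the sample.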

MTB > stem c1

Stem-and-leaf of C1 N = 50
Leaf Unit = 1.0

    1   -6 8
    1   -5
    1   -4
    1   -3
    1   -2
    1   -1
   25   -0 422222111100000000000000
   25    0 0000000000111223333577
    3    1 13
    1    2
    1    3 8

WE SEEM TO HAVE SOME OUTLIERS.

MTB > nscores c1 in c2
MTB > plot c1 c2

[Character plot of C1 (vertical axis, ticks at -70, -35, 0, 35) against C2 (horizontal axis, ticks at -1.60, -0.80, 0.00, 0.80, 1.60). The horizontal positions of the points did not survive: one point lies near 35, one near -70, and the rest lie on or close to the row at 0.]

THE NORMAL PLOT SHOWS THE OUTLIERS BUT IT ALSO SHOWS SOME CURVATURE
WHICH IS NOT DUE TO OUTLIERS. LET'S REMOVE THE TWO BIGGEST OUTLIERS
(ROWS 19 AND 50 OF C1) AND SEE WHAT HAPPENS.
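
The `nscores` and `plot` steps can be sketched in Python for anyone who wants to check the numbers. This is only a sketch under assumptions: it uses the plotting position (rank - 3/8) / (n + 1/4), which I am assuming matches what `nscores` computes, and it ignores ties (there are none in these data). The helper names `normal_scores` and `pearson_r` are mine. The correlation between the data and their normal scores is a rough numerical stand-in for "how straight is the plot": a perfectly straight plot would give 1.

```python
from statistics import NormalDist

# The 50 values printed from C1 above.
c1 = [
    -1.6483, -2.7595, -1.1327, -2.5016, -2.3328, 7.7386, -0.2006,
    -0.7827, 11.8376, -0.6020, -0.1113, 1.2924, -0.6804, 3.6877,
    0.5656, 0.9823, 2.4857, 0.6551, -68.3163, 0.6626, 1.6935,
    13.8105, 7.3409, -0.4212, 0.5819, 0.0112, -0.9809, -0.6411,
    -0.0293, 0.5744, -0.0272, -0.9101, 5.8839, 0.4478, -0.1435,
    2.7998, -2.0776, 0.0468, 0.7280, -1.4928, -0.3862, -2.0653,
    -4.0453, 3.2031, -0.5641, 3.4721, 3.6683, -1.9037, 1.1137,
    38.4306,
]


def normal_scores(data):
    """Return the normal score of each value, in the original order.

    Assumes no ties; plotting position is (rank - 3/8) / (n + 1/4).
    """
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


c2 = normal_scores(c1)
r = pearson_r(c1, c2)
print(f"correlation between C1 and its normal scores: {r:.3f}")
```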

MTB > copy c1 into c4;
SUBC> omit row 19.
MTB > stem c4

Stem-and-leaf of C4 N = 49
Leaf Unit = 1.0

   24   -0 422222111100000000000000
  (19)   0 0000000000111223333
    6    0 577
    3    1 13
    1    1
    1    2
    1    2
    1    3
    1    3 8

MTB > copy c4 into c5;
SUBC> omit row 49.
MTB > stem c5

Stem-and-leaf of C5 N = 48
Leaf Unit = 1.0

    1   -0 4
    6   -0 22222
   24   -0 111100000000000000
   24    0 0000000000111
   11    0 223333
    5    0 5
    4    0 77
    2    0
    2    1 1
    1    1 3

MTB > nscores c5 in c6
MTB > plot c5 c6

[Character plot of C5 (vertical axis, ticks at 0.0, 6.0, 12.0) against C6 (horizontal axis, ticks at -1.60, -0.80, 0.00, 0.80, 1.60). The horizontal positions of the points did not survive: most points lie in the rows near 0.0, with a few isolated points between 6.0 and just above 12.0.]

THE STEM AND LEAF SUGGESTS AN OUTLIER PROBLEM. THE NORMAL PLOT
INDICATES THAT THE "OUTLIERS" ARE ACTUALLY PART OF THE OVERALL
PATTERN HERE, WHICH IS A CURVE WITH A POSITIVE SECOND DERIVATIVE.
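
A "positive second derivative" means the slope of the normal plot increases from left to right. One crude way to check that numerically, sketched here in Python, is to fit separate least-squares lines to the lower and upper halves of the trimmed data and compare the slopes. The split into halves is my own illustrative diagnostic, not something the Minitab session does, and the normal scores again assume the plotting position (rank - 3/8) / (n + 1/4).

```python
from statistics import NormalDist

# The 50 values printed from C1 above.
c1 = [
    -1.6483, -2.7595, -1.1327, -2.5016, -2.3328, 7.7386, -0.2006,
    -0.7827, 11.8376, -0.6020, -0.1113, 1.2924, -0.6804, 3.6877,
    0.5656, 0.9823, 2.4857, 0.6551, -68.3163, 0.6626, 1.6935,
    13.8105, 7.3409, -0.4212, 0.5819, 0.0112, -0.9809, -0.6411,
    -0.0293, 0.5744, -0.0272, -0.9101, 5.8839, 0.4478, -0.1435,
    2.7998, -2.0776, 0.0468, 0.7280, -1.4928, -0.3862, -2.0653,
    -4.0453, 3.2031, -0.5641, 3.4721, 3.6683, -1.9037, 1.1137,
    38.4306,
]

# Drop row 19 (-68.3163) and the last row (38.4306), as in the session,
# then sort so that position i holds the value with rank i + 1.
c5 = sorted(x for i, x in enumerate(c1) if i not in (18, 49))
n = len(c5)

# Normal scores for the sorted values (assumed plotting position).
std_normal = NormalDist()
c6 = [std_normal.inv_cdf((rank - 0.375) / (n + 0.25)) for rank in range(1, n + 1)]


def ls_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


half = n // 2
lower_slope = ls_slope(c6[:half], c5[:half])
upper_slope = ls_slope(c6[half:], c5[half:])
print(f"slope over lower half: {lower_slope:.2f}")
print(f"slope over upper half: {upper_slope:.2f}")
```

If the upper-half slope is clearly larger than the lower-half slope, the plot bends upward, which is the pattern described in the comment above.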

Robert W. Hayden
Department of Mathematics
Plymouth State College
Plymouth, New Hampshire 03264 USA
Rural Route 1, Box 10
Ashland, NH 03217-9702
(603) 968-9914 (home)
hayden@oz.plymouth.edu
fax (603) 535-2943 (work)