>
> Now to box plots. We teach students to draw box plots by hand,
> initially. To keep things simple, we draw our whiskers right out
> to the furthest outlier. (Keep in mind our datasets are small.)
> But I've noticed that computer packages are more sophisticated -
> they draw the whiskers out to say 1.5 SDs
Probably 1.5 IQRs beyond the quartiles, and they use these to identify outliers, NOT as the endpoints of the whiskers, which are at the most extreme NON-outliers.
> away from the mean, and then
> use symbols to represent outliers (eg dots for 'mild' outliers and
> squares for 'severe' outliers).
>
> I think this is great for a computer to do, but what about for kids?
> Isn't it sufficient to just determine the 5-number summary, and plot
> those values? Some information is lost, but much time is saved.
Of course, the biggest time-saver is to not look at the data at all!
One of my biggest concerns about the changes I see in statistical education is that many of the materials I see lose track of WHY a topic is there to begin with. For example, some of the best-selling texts by non-statisticians are starting to include boxplots -- but they never DO anything with them. It reminds me of the "new math" years when sets were introduced to "unify" mathematics -- but were only used in a unit on sets! In that implementation, they obviously did not serve the intended purpose!
So, WHY are we doing boxplots, anyway? (If the materials you use for your own study or with your students do not answer such questions, you may need to look at other materials.) I see two reasons:
1. They are the standard tool for graphically comparing multiple groups. For just two groups, back to back stem and leaf displays (or more primitive histograms) are a possibility, but for more than two groups these are hard to align. Dotplots and stem plots often contain more detail than you want to see when there are multiple groups.
2. They are the standard tool for preliminary outlier identification.
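To make reason 2 concrete, here is a minimal sketch of the 1.5 IQR rule mentioned above. It is an illustration, not any particular package's method. The function name `boxplot_summary`, the quartile convention (medians of the lower and upper halves, with the middle value left out when n is odd) and the data are all assumptions chosen for the example. Packages differ in how they compute quartiles, so their fences may differ slightly.

```python
from statistics import median

def boxplot_summary(values, k=1.5):
    """Five-number summary plus k*IQR fences, whisker ends and outliers.

    Hypothetical helper for illustration; needs at least two values.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Quartiles as medians of the lower and upper halves (middle value
    # left out when n is odd). One common classroom convention, assumed here.
    q1 = median(xs[:half])
    q3 = median(xs[n - half:])
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers stop at the most extreme values that are NOT outliers.
    inside = [x for x in xs if lo_fence <= x <= hi_fence]
    outliers = [x for x in xs if x < lo_fence or x > hi_fence]
    return {
        "five_number": (xs[0], q1, median(xs), q3, xs[-1]),
        "fences": (lo_fence, hi_fence),
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }

data = [2, 3, 4, 5, 5, 6, 7, 8, 9, 30]  # made-up illustration data
summary = boxplot_summary(data)
print(summary)
```

With this made-up data the "quick" boxplot would run its upper whisker all the way to 30, while the fence-based version stops the whisker at 9 and plots 30 separately. Calling the function with `k=3` gives the wider fences often used to separate 'severe' from 'mild' outliers.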
If you do the "quick" boxplots described above, you lose item 2, or half the reason for doing boxplots in the first place. If you don't want to use the standard tool, you then need to ask what tool you are going to use instead. IF you just do the five number summary and do not talk about least squares statistics (means, variances, standard deviations, correlations, regression lines, etc.) then you can get by for a while not saying too much about outliers. However, once you start doing means and standard deviations, you need to talk about their lack of robustness to outliers. Because the least squares techniques are more familiar to most non-statisticians, there is a tendency to regard them as more basic or more important, rather than as specialized techniques for more or less normally distributed data that is free of outliers. Because they are limited and specialized, you need to talk about the concepts (normality, skewness, outliers, etc.) that govern their use. Not doing so is like teaching mathematical theorems without stating the hypotheses that tell you when the theorem is true. (Can't we leave those to a later course?)
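The lack of robustness is easy to show with a small experiment. The sketch below uses made-up numbers and a hypothetical `iqr` helper (same half-median quartile convention as a hand-drawn boxplot, which is an assumption). It replaces one value in a small dataset with a wild one and compares what happens to the mean and standard deviation with what happens to the median and IQR.

```python
from statistics import mean, median, stdev

def iqr(values):
    """Interquartile range using medians of the lower and upper halves.

    Hypothetical helper; the quartile convention is an assumption.
    """
    xs = sorted(values)
    half = len(xs) // 2
    return median(xs[len(xs) - half:]) - median(xs[:half])

clean = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]   # made-up illustration data
contaminated = clean[:-1] + [100]          # one wild value replaces the largest

for name, xs in (("clean", clean), ("contaminated", contaminated)):
    print(f"{name:>13}: mean={mean(xs):.2f} sd={stdev(xs):.2f} "
          f"median={median(xs)} iqr={iqr(xs)}")
```

In this example the single wild value moves the mean from 5.9 to 14.9 and inflates the standard deviation many times over, while the median (5.5) and the IQR (4) do not change at all.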
High school teachers may be more familiar with the underlying philosophical issues as they apply to graphing calculators. Are these "add ons" that enable you to do the SOS faster, or are they tools to revolutionize the way we teach mathematics? A lot of the stats. materials I see treat boxplots, etc., as add-ons or alternate ways to do the SOS (or as things we do because some curriculum guide said we had to) rather than as tools that revolutionize the way we handle data. (Note that the statistics revolution was decades before the calculator revolution.)
--
Robert W. Hayden
Department of Mathematics
Plymouth State College
Plymouth, New Hampshire 03264 USA
Rural Route 1, Box 10
Ashland, NH 03217-9702
(603) 968-9914 (home)
firstname.lastname@example.org
fax (603) 535-2943 (work)