Topic: Computers and AP

 Richard Scheaffer Posts: 440 Registered: 12/6/04
Computers and AP
Posted: Apr 26, 1996 4:23 PM

WHY COMPUTERS IN INTRODUCTORY STATISTICS?

Richard L. Scheaffer
University of Florida
Chief Faculty Consultant, AP Statistics

In recent years, much discussion has taken place around the role of
computers in teaching introductory (pre-calculus) statistics. Since the AP
Statistics Course Description has been out, this discussion has broadened
to include a new audience of high school teachers. I have an abiding
interest in teaching introductory statistics and had something to do with
that Course Description, and so I will add my thoughts and opinions to
this discussion.

In recent times the argument has moved from "some technology v. no
technology" to "computer technology v. graphing calculator technology."
For those of us who have been around awhile, this is a positive step,
for we still have colleagues who insist that students remember the
"shortcut" formulas for calculating certain statistics, and who make them
work through numerous examples by hand. These hand calculation skills will
be of little help on an AP exam. All formulas used on the exam are given
in the form that was thought to be most meaningful for understanding the
underlying concept, not for simplifying calculations.

Can a student do well in introductory statistics without ever touching a
computer? Yes. Can a student perform well on the AP Statistics exam
without having computer experience? Yes. In fact, a student who
understands statistics can do well on an AP exam with just a scientific
calculator and may do OK with NO calculator. Calculation is not the key
to success here! The policy is, though, that students are expected to have
a graphing calculator for the exam.

The AP Statistics Course Description recommends, however, that students
get some experience with modern statistical software sometime during the
course. Why do I think this is a sound and reasonable policy for a course
of the type we are promoting here?

The AP course emphasizes data collection, summarization and analysis as
the basis for decision making under uncertainty. It is designed to be a
course about the practice of modern statistics, but taught so that the
practitioners understand the underlying concepts that are at work. Since
we are not going to prove any theorems, the only way for students to
understand these concepts is to provide them with empirical evidence.
That empirical evidence comes about most efficiently and effectively
through the use of a computer. I will embellish this general comment in
just two areas, exploratory data analysis and simulation in inference.

Exploratory data analysis is much more than drawing a boxplot or two,
which can be done on a graphing calculator (albeit without scales on the
axes). It is sometimes defined as the art of seeing into the data through
revelation, residuals, re-expression, and resistance. Revelation comes
about first by looking at various plots of the data (stemplots, boxplots,
dotplots, scatterplots, matrix scatterplots, three-dimensional plots, etc.).
Modern software has a host of plots most students have never seen before
and allows for the tailoring of these plots to emphasize certain features of
the data. Also, the plots might be linked so that a potentially influential
observation highlighted in a scatterplot will show up on a histogram or a
stemplot, or in the data set itself, allowing connections to be made.
Exploration is only of interest, however, on real data sets, many of which
are too large to be entered into a graphing calculator (although this is only
a temporary problem).

Residuals have to do with fitting a model to the data and looking at the
difference between the model predictions and the observed data points.
Modern computer software allows rapid fitting of a wide variety of
models, and will automatically store the residuals and
standardized residuals for future analysis, including the exploratory plots
mentioned above. A key concept of model fitting is that of influential
observations. Points that do not fit the pattern can be isolated, moved or
deleted quickly and the effect on the fit of the model can be readily
observed. In fact, some software programs allow the regression line to
move about on the screen while points are being added, deleted, or moved.
Such dynamic demonstrations are very effective in teaching concepts.
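
As a rough illustration of the idea, here is a minimal sketch in Python.
The data are made up, and the names fit_line and residuals are
illustrative helpers, not taken from any particular statistics package.
It fits a least-squares line with and without one point that lies far
from the pattern and prints both slopes.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys, slope, intercept):
    """Observed value minus the value the fitted line predicts."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data: eight points near y = 2x + 1, plus one point far off the pattern.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 17.1]
odd_x, odd_y = 20, 5.0

slope_clean, intercept_clean = fit_line(xs, ys)
slope_all, intercept_all = fit_line(xs + [odd_x], ys + [odd_y])
resid_all = residuals(xs + [odd_x], ys + [odd_y], slope_all, intercept_all)

print(f"without the odd point: slope = {slope_clean:.2f}")
print(f"with the odd point:    slope = {slope_all:.2f}")
print(f"residual at the odd point: {resid_all[-1]:.2f}")
```

Changing odd_x or odd_y and rerunning gives a crude, non-graphical
version of dragging a point and watching the line move.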

Re-expression means transformations. Data can be transformed by a wide
array of built-in functions, stored and used in model fitting, often with just
a single command, by modern statistical software. The linking of data sets
allows for dynamic changes to, say, a scatterplot to be viewed on the
screen while the transformation is taking place. This is one of the most
effective ways I've ever found to show students what a power
transformation does.
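
A small sketch of what a power transformation does, again in Python with
made-up data. The helper names correlation and power_transform are
illustrative assumptions, and treating "power 0" as the logarithm is the
usual ladder-of-powers convention rather than anything specific to this
course. The curved data become a straight line at the right power, which
shows up as a correlation of essentially 1.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def power_transform(values, power):
    """Raise positive values to `power`; power 0 is taken to mean the log."""
    if power == 0:
        return [math.log(v) for v in values]
    return [v ** power for v in values]


# Made-up curved data: y grows like the square of x.
xs = list(range(1, 11))
ys = [2 * x ** 2 for x in xs]

for power in (1, 0.5, 0):
    r = correlation(xs, power_transform(ys, power))
    print(f"power {power}: correlation with x = {r:.4f}")
```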

Resistance means making use of statistics that protect the analysis against
unusual data points, like using the median rather than the mean as a
measure of center. Modern software includes a variety of techniques for
fitting resistant models, including resistant regression lines and time series
smoothing.
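
The mean-versus-median point can be seen in a few lines. This is a
minimal sketch using Python's standard statistics module; the scores are
made up, and the "mistyped" value of 630 is a hypothetical data-entry
error introduced only for illustration.

```python
import statistics

scores = [52, 55, 57, 58, 60, 61, 63]   # made-up data
with_error = scores[:-1] + [630]        # last value mistyped as 630

for label, data in (("original", scores), ("with 630", with_error)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label:9s} mean = {mean:6.1f}   median = {median}")
```

One wild value moves the mean a long way, while the median does not move
at all.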

This leads me to the next section, the use of simulation in inference.
Everyone has a favorite technique for illustrating the Central Limit
Theorem for means of random samples through a simulation. At this
point, these are cumbersome to carry out on a graphing calculator,
although simple ones can be done. These types of simulations can easily
be extended to simulations of the behavior of confidence intervals for a
mean. Suppose we want to illustrate how the sampling distribution (and
confidence interval) for the median compares to that for the mean. Easily
done on most computers. Similarly, suppose we want to illustrate the
sampling distribution of the maximum of a sample, or the correlation
coefficient, or the sample standard deviation. All of these are easily
accomplished with modern software, and all add to the learning experience
of the students.
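
A sketch of such a simulation, using only Python's standard library. The
choices here are assumptions made for illustration: a skewed population
(exponential with mean 1), samples of size 30, 2000 repetitions, a fixed
seed, and a large-sample interval of the form mean plus or minus 1.96
standard errors. The helper names draw_sample, simulate, and ci_coverage
are hypothetical.

```python
import random
import statistics


def draw_sample(sample_size, rng):
    """One random sample from an exponential population with mean 1."""
    return [rng.expovariate(1.0) for _ in range(sample_size)]


def simulate(statistic, sample_size, reps, rng):
    """Values of `statistic` computed on `reps` independent samples."""
    return [statistic(draw_sample(sample_size, rng)) for _ in range(reps)]


def ci_coverage(sample_size, reps, rng, true_mean=1.0, z=1.96):
    """Fraction of large-sample intervals for the mean that cover true_mean."""
    hits = 0
    for _ in range(reps):
        sample = draw_sample(sample_size, rng)
        center = statistics.mean(sample)
        half_width = z * statistics.stdev(sample) / sample_size ** 0.5
        if center - half_width <= true_mean <= center + half_width:
            hits += 1
    return hits / reps


rng = random.Random(1996)  # fixed seed so the run can be repeated
means = simulate(statistics.mean, 30, 2000, rng)
medians = simulate(statistics.median, 30, 2000, rng)
maxima = simulate(max, 30, 2000, rng)
coverage = ci_coverage(30, 2000, rng)

for name, values in (("mean", means), ("median", medians), ("maximum", maxima)):
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    print(f"{name:8s} center = {center:.3f}   spread = {spread:.3f}")
print(f"share of intervals covering the true mean: {coverage:.3f}")
```

Swapping in another statistic (the sample standard deviation, say) only
means passing a different function to simulate.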

Now, I can hear the argument that much of this material mentioned above
is not in the AP outline. We do not have to do sampling distributions for
the median or for a maximum, for example. True enough. But, the idea
of an AP course is to make it an enriching experience for the student.
Having some experience with medians and maxima, for example, will help
the understanding of the concept of sampling distribution and will show
how techniques generalize to a larger class of applications. (Students who
study French learn something about English. Students who study physics
learn a little about applied mathematics.)

There is, in addition to the above, the small point that computer experience
will help the student with his or her college work in statistics. A student
may get by in high school by completing all data analysis assignments on
a calculator, but that is not going to happen in college.

The real question, then, is "What do you want your students to get out of
the course?" If a passing grade on the AP exam is the only objective, then
the computer is not essential. If the goal is to present an interesting,
lively, enriching modern course that will help students pass the AP exam
and understand concepts that will improve their practice of statistics in the
future, then the computer becomes essential, in my opinion. The question
is not one of using a sledgehammer on a tack. It is one of understanding
when, why and how to use a tack as opposed to when, why and how to
use a staple or a spike.

For reference, see Hoaglin and Moore, Perspectives on Contemporary
Statistics, MAA Notes no. 21, 1992.

© The Math Forum at NCTM 1994-2017. All Rights Reserved.