Siegel&Morgan review
Posted: Apr 26, 1996 1:30 PM
Book Review
> Statistics and Data Analysis: An Introduction (2nd ed.)
> Andrew F. Siegel and Charles J. Morgan
> Wiley, 1995, $68, ISBN 0471574244
>
> This book is the second edition of an underground classic first
> published in 1988. The first edition (by Siegel alone) was
> reviewed for Volume 26 of STN (December 1990) by Joan Garfield.
> She used it for many years at the University of Minnesota, and
> colleagues and I used it at Plymouth State College until last
> summer. It recently showed up as one of the seven textbooks the
> College Board is recommending for the new Advanced Placement Test
> in Statistics. Even so, it is no secret that the book was not a
> great success in the marketplace. The second edition attempts to
> broaden the book's appeal. I think it succeeds, but sometimes at
> the expense of muting a few of the virtues that made the first
> edition so outstanding.
>
> One of those virtues is the writing. The first edition was by
> far the most readable introductory statistics text I have ever
> used. It was also written in a warm and friendly tone that
> remains unusual in statistics textbooks. The second edition
> maintains a high level of readability. The warmth is somewhat
> diluted.
>
> Another thing that set the book apart was its content. Although
> the NCTM Standards suggest big changes in how mathematics is
> taught, and smaller changes in what mathematics is taught, the
> underlying mathematics has not changed much. One and one still
> is two, and has been for quite some time. Statistics, on the
> other hand, underwent a great revolution in the 1960's, a
> revolution often linked with the name of John Tukey. One of the
> first things I look for in a statistics textbook is whether there
> is any sign that the author has heard about this revolution yet.
> Because so many statistics textbooks are written by
> non-statisticians, the news has spread very slowly.
> Andrew Siegel was part of the Tukey revolution, and I think that is one
> reason why the first edition was ahead of its time. All the good
> K-12 statistics materials from NCTM and QLP are definitely
> post-Tukey, but many college textbooks still are not. Caveat emptor.
>
> One sign of Tukey's influence is the use of stem (and leaf) plots
> and box (and whisker) plots. These are a necessary condition
> for textbook adoption these days, but alas not a sufficient one.
> A few of you may remember the "new math" era, when set ideas were
> supposed to unify all of mathematics. We then saw textbooks that
> sprouted an obligatory "Chapter 0" where set notation (not ideas)
> was introduced, and then forgotten, and certainly never used to
> unify the rest of the content. Similarly, we have reached the
> point where most textbooks now mention the stem and leaf or
> boxplot, but many really don't know what they are for, and so
> never use them for anything.
>
> Another way people characterize the Tukey revolution is in terms
> of the "three R's" of post-Tukey statistics:
>
>      residuals
>      reexpression
>      robustness
>
> Residuals are usually first encountered in the context of fitting
> lines to data. There they are the (signed) distances between the
> points and the fitted line. Analyzing them helps us to evaluate
> how well our straight line model fits the data. Siegel and
> Morgan introduce residuals very early -- the deviations from the
> mean that figure in the computation of variance and standard
> deviation are presented as residuals. Toward the end of the book
> there is a masterful example of the use of residuals in
> regression analysis. Data is presented on the average heights of
> girls for ages 2-11. Height versus age looks like a nearly
> perfect straight line, and the correlation is 0.997. Yet a graph
> of the residuals shows pronounced curvature in the relationship,
> something you would never see without examining the residuals!
> (Although the book does not mention it, fitting a quadratic to
> the data gives a residual plot that clearly indicates a cubic
> component!) This is an example of one of the great strengths of
> this book -- it not only shows you the latest techniques, it shows
> them to you in examples that indicate what the technique does for
> you and why it is important, rather than with examples that merely
> show you the mechanics of carrying out the technique. Without
> the "why", the "how" is useless.
>
> Reexpression is more often called "transformation". Perhaps the
> most traditional example of that is the fact that some
> relationships are better plotted on logarithmic or
> semilogarithmic graph paper. The TI-82 calculator uses such
> transformations (taking logs of x or y or both) to fit a variety
> of models to two-variable data. The first edition of Siegel
> contains the best elementary introduction to the use of
> transformations in statistics. They are introduced early in the
> book and used in both the analysis of variance and regression
> chapters. The second edition contains the second-best elementary
> introduction to the use of transformations in statistics. The
> initial coverage is cut about in half, and the applications to
> regression have disappeared. This is especially unfortunate for
> use in the high schools, where the logarithmic and exponential
> curve fitting features have found many uses in mathematics and
> science classes, and raised a lot of questions and confusion
> among teachers about what is going on there. In this instance, I
> think Wiley has stepped backward too far. While the first
> edition may have been (too far?) ahead of its time, much has
> changed since 1988, and in this area the second edition is behind
> the times -- though still ahead of most other textbooks!
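A minimal sketch of the residuals and reexpression ideas described above, written in Python rather than Minitab. It uses made-up data (exact exponential growth), not any data set from the book, and the helper names fit_line and residuals are hypothetical. A straight line fitted to the raw values leaves residuals that are positive at both ends and negative in the middle, which is the kind of curvature a residual plot reveals. After taking logs of y, the same fit leaves residuals that are essentially zero.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(xs, ys):
    """Signed vertical distances between each point and the fitted line."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Made-up data (an assumption for illustration): exact exponential growth.
xs = list(range(1, 11))
ys = [2.0 * math.exp(0.3 * x) for x in xs]

# Residuals from a straight line fitted to the raw values.
raw_res = residuals(xs, ys)

# Reexpression: fit the line to log(y) instead of y.
log_res = residuals(xs, [math.log(y) for y in ys])

print("raw residual signs:", "".join("+" if r > 0 else "-" for r in raw_res))
print("largest |residual| after taking logs:", max(abs(r) for r in log_res))
```

With real data the residuals after reexpression would not vanish, but they should lose the systematic bend if the transformation suits the relationship.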
>
> The third R, robustness, refers to the ability of a statistical
> measure or technique to resist the effects of errors and outliers
> in the data, or violations of the assumptions underlying the
> technique. The traditional mean and standard deviation are not
> very robust to outliers, and so the more robust median and
> interquartile range are preferred in many situations. (Note that
> the boxplot is based on them.) Siegel and Morgan introduce these
> robust measures first, and present them as the standard tools.
> The mean and standard deviation are then introduced as
> specialized tools particularly appropriate to normally
> distributed data. This makes it clear that the Tukey revolution
> really was a revolution -- it not only introduced additional
> techniques, but changed the way statisticians regard the older
> techniques. Siegel and Morgan understand this, but many other
> textbook authors do not.
>
> The new edition extends the coverage of nonparametric techniques.
> These are techniques that make fewer assumptions than the
> traditional techniques, and generally handle outliers better.
> They may be less efficient if your data really are drawn from a
> normal distribution, but safer if they are not, or if you cannot
> tell, as with small samples.
>
> One serious flaw in the first edition was the very small number
> of problems for students. I would estimate that the new edition
> has three to five times as many. There are answers to about half
> of these, and the answers contain more words than numbers. The
> words deal with interpretation of the data, which, after all, is
> what statistics is all about. There is also an Instructor's
> Solution Manual in the works with more detailed solutions. (I
> put it on reserve in the library for student use.)
>
> Another criticism of the first edition was that it contained
> hardly any formulas. Calculations were explained in a manner
> resembling instructions for filling out your income tax forms.
> Personally, I saw this as an asset. I teach a general education
> statistics course to first and second year students at a small
> former state teachers' college. For most of these students,
> formulas would be a barrier to understanding rather than a path
> to understanding. However, if you are a high school teacher
> trying to show students the use of algebra in statistics, you
> will want to see formulas. If you are a high school teacher
> doing an AP Statistics course, you will want to keep your
> students' algebra skills reasonably fresh for when they take the
> SAT and go on to college. Indeed, the sample questions
> distributed for AP Statistics require much more algebraic
> facility than most of my college students have. For those who
> like a little algebra in their statistics, the second edition of
> this book is now bilingual.
>
> Indeed, the second edition is trilingual. The steps of carrying
> out a procedure are given in words, in formulas, and in commands
> for the Minitab statistical software. The computer examples do
> not replace a manual for the software; often you see just the
> final steps of an analysis, without any explanation of how they
> set up the database or how they got to the last step. At least
> the examples get you started and provide some experience in
> interpreting computer printout in situations where no computer is
> available. I think the choice of the Minitab software package is
> a good one. There are versions of Minitab for DOS, Windows,
> and the Macintosh. The software was originally designed for
> educational purposes, and is probably the most widely used
> software in college statistics courses, yet it is also used by a
> majority of the Fortune Top 50 companies in the US. It was also
> one of the first packages to reflect the Tukey revolution. A
> disk containing most of the data sets from the book is promised.
> The draft disk I examined had some bugs in it but there were
> about 100 data sets, some of them definitely too large to ask
> students to type in.
>
> There is no mention of calculators in either edition of the book.
> That does not bother me, since a computer is a much more
> appropriate tool for statistics, but it may bother some high
> school teachers for whom graphing calculators are more familiar
> and accessible to both themselves and their students. While we
> all have to do the best we can with what we have, I hope prior
> comfort levels with calculators will not divert teachers from
> pressing for more appropriate technology.
>
> One of the limitations of calculators in statistics is their
> limited data storage capacity. This book "recycles" many of its
> data sets over and over, using them to illustrate a number of
> different points. Sometimes a question raised in one chapter is
> not fully answered until a later chapter when the same data is
> examined again. I think this is a good technique, but I would
> hate to have to constantly be retyping or reloading the data into
> a calculator.
>
> My biggest disappointment with this text is that it does not do a
> very good job of convincing the student that statistics is
> important. There are many real data sets, and they are often
> extremely well chosen to illustrate the techniques, but the
> techniques are not often used to answer any real question of
> interest. For example, the areas of important islands in the
> Atlantic Ocean are used as an example of transforming data. It
> is a wonderful example for that purpose. If you plot the data on
> a linear scale you get Greenland at one end of the graph and a
> big smudge including all the other islands at the other end. You
> cannot even get a legible graph without transforming this data.
> However, no reason is ever given as to why we might want to study
> the areas of these islands.
> We come away from the example
> knowing more about statistics, but we do not know any more about
> islands. This is sad, because statistics is primarily a tool to
> answer real questions in areas outside of statistics. I should
> make it clear that the present book is not outstandingly bad in
> this regard. It is actually somewhat above average. However, it
> is a failing of most textbooks that has come to bother me more
> and more each year. You will need to supplement this one (and
> most others) with some more motivating and realistic examples.
>
> One potential supplement would be _Statistics by Example_ by
> Sincich (Dellen). This book contains a huge number of problems
> based on real studies. Often the background is too sketchy or
> too technical, and sometimes we get only summary statistics
> rather than raw data, but there are so many problems that it is
> still a worthwhile resource. (The book is pretty ordinary
> otherwise, with only slight signs of Tukey-awareness.)
>
> My colleague Bill Roberts and I are currently halfway through an
> introductory statistics course using the Siegel and Morgan text,
> and we are quite happy with it. I urge anyone looking for a
> textbook to adopt to look at it. Those wanting to learn more
> about statistics themselves might want to try to dig up a copy of
> the first edition.
>
>
> Reviewed by Robert Hayden
> Plymouth State College
> Plymouth, New Hampshire
>
> *********************************************************
>
--
Robert W. Hayden
Department of Mathematics
Plymouth State College
Plymouth, New Hampshire 03264 USA
Rural Route 1, Box 10
Ashland, NH 03217-9702
(603) 968-9914 (home)
fax (603) 535-2943 (work)
hayden@oz.plymouth.edu