I wholeheartedly agree with the conclusion that a single standardized test can't be used to evaluate the quality of teaching. I strongly disagree with the implicit conclusion that standardized tests should not be used as indicators of where to look for causal explanations.
Popham fails to address the use of ensemble average trends over several years as indicators of teaching quality. If at a particular school the average ERB mathematics score dropped by one standard deviation per year relative to an adjacent school, it certainly doesn't PROVE anything about the quality of teaching. It might be the case that the apparently poor-performing school was in fact addressing a very current and very useful branch of mathematics not even remotely considered for inclusion on a mere standardized test. It might be the case that the high-scoring school cheated or taught a very narrow curriculum directed specifically at the test. Hopefully a number of people would be strongly motivated to investigate what was going on in some detail.
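To make the trend idea concrete, here is a minimal sketch in Python. The yearly averages, the test standard deviation of 10, and the threshold of half a standard deviation per year are all invented for illustration; none of them come from Popham or from real ERB data. The point is only that a multi-year slope in the gap between two schools is easy to compute and can serve as a flag for investigation, not as a verdict.

```python
from statistics import mean


def trend_slope(years, values):
    """Ordinary least-squares slope of values against years."""
    year_bar = mean(years)
    value_bar = mean(values)
    num = sum((y - year_bar) * (v - value_bar) for y, v in zip(years, values))
    den = sum((y - year_bar) ** 2 for y in years)
    return num / den


def relative_trend(years, school_a, school_b, test_sd):
    """Slope of the (A minus B) gap in average score, in SDs per year."""
    gap = [(a - b) / test_sd for a, b in zip(school_a, school_b)]
    return trend_slope(years, gap)


# Hypothetical numbers: yearly average scores on a test assumed to have SD = 10.
years = [1996, 1997, 1998, 1999]
school_a = [50.0, 40.0, 30.0, 20.0]   # made-up declining school
school_b = [50.0, 50.0, 50.0, 50.0]   # made-up adjacent school

slope = relative_trend(years, school_a, school_b, test_sd=10.0)

# Illustrative threshold (an assumption): a relative drop of half an SD
# per year or worse is a reason to look closer, not a conclusion.
needs_investigation = slope <= -0.5
```

A flag like `needs_investigation` says nothing about why the gap is growing. It only identifies where someone should go and look.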
Popham seems to believe that including items on standardized tests that are not part of standard texts is inappropriate. Popham cites the study: Freeman, D. J., Kuhs, T. M., Porter, A. C., Floden, R. E., Schmidt, W. H., & Schwille, J. R. (1983). Do textbooks and tests define a natural curriculum in elementary school mathematics? Elementary School Journal, 83(5), 501-513. "The proportion of topics presented on a standardized test that received more than cursory treatment in each textbook was never higher than 50 percent" (p. 509). That is entirely consistent with my educational philosophy: Don't teach to the test, teach beyond it. Success at that task brings a high likelihood of 90th percentile scores on the tests. But so what? It is far more important to have acquired some useful skills. Real life is enormously more complicated than a standardized test. It's OK to have high test scores, but substantial skills far beyond their narrow content are desirable. The standardized test knowledge areas are like (but aren't) subsets of the skills people use in modern society.
The author fails to address the use of control populations under the heading "Confounded Causation". I used "find" on the original text, and neither the isolated word "control" nor the phrase "comparison population" appears in the article. California makes a crude adjustment to the STAR test by taking into account an SES index and reporting raw rank as well as adjusted rank for individual schools. If at a particular school, all of the students of a particular teacher scored 30 percentile points lower than those of another teacher with a nominally similar group of students, wouldn't you be a bit suspicious? Wouldn't you want to find out if fumes from a lead smelter were preferentially precipitating in the east room? Or should you use Popham's conclusion that standardized test scores should not be used for comparison purposes and ignore the data? If in a particular district, a particular school scored two standard deviations below eight other schools with similar SES compositions, isn't further investigation warranted? If test score averages are contour plotted and the outlines of particular states are recognizable, should that information be ignored? Or should it be considered in conjunction with other data to formulate a plan of action?
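The comparison-population check can be sketched the same way. This Python fragment is an illustration under stated assumptions: the eight peer scores and the school's own score are invented, "similar SES" is taken as given rather than computed, and the standard deviation used is the spread among the peer schools, which is only one possible reading of "two standard deviations below".

```python
from statistics import mean, stdev


def z_against_peers(score, peer_scores):
    """How many peer-group standard deviations a score sits from the peer mean."""
    return (score - mean(peer_scores)) / stdev(peer_scores)


# Hypothetical average scores for eight schools assumed to have similar SES.
peer_scores = [60.0, 62.0, 64.0, 66.0, 58.0, 61.0, 63.0, 62.0]
school_score = 55.0  # made-up score for the school in question

z = z_against_peers(school_score, peer_scores)

# Illustrative rule (an assumption): two or more SDs below comparable
# schools warrants further investigation alongside other data.
warrants_investigation = z <= -2.0
```

Comparing only against schools with similar SES is a crude way of controlling for that variable. It narrows the list of explanations without picking one.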
In the same section, Confounded Causation, the following three statements are made: "Few parents spend much time teaching their children about the intricacies of algebra or how to prove a theorem." "The most troubling items on standardized achievement tests assess what students have learned outside of school." "One of these factors was directly linked to educational quality. But two factors weren't." It seems to me that if a parent has spent substantial time with their child on homework, it is likely that the child will have received a better education. Part of the success of PS 169 in NYC is that they have an explicit program to involve parents in the education of their own children. I believe that any school that fails to even attempt such involvement SHOULD be judged harshly for that failure whether it shows up in test results or not. Popham has subtly substituted "school based educational quality" for "overall educational quality" and ignored the question of whether there should be links between them.
Finally, I do agree with Popham's overall conclusions: "I suggest a three-pronged attack on the problem. First, I think that you need to learn more about the viscera of standardized achievement tests. Second, I think that you need to carry out an effective educational campaign so that your educational colleagues, parents of children in school, and educational policymakers understand what the evaluative shortcomings of standardized achievement tests really are. Finally, I think that you need to arrange a more appropriate form of assessment-based evidence." But I don't think they go quite far enough. The POSSIBILITY that the real problem is simply an inferior education is discounted. I would also add that the same groups should be educated about the analysis of trends and the power of controlling for other variables rather than just looking at raw scores.