General Comments on Standard Deviation

Date: 01/15/2004 at 01:23:40
From: Mike Gray
Subject: Understanding Standard Deviation

I have a basic understanding of the Standard Deviation and what it's used for, but there are some things about it that I do not quite grasp.

1. When someone says, "Compute the Standard Deviation of a data set", is that the same as asking to compute ONE Standard Deviation? If they wanted to know the range of data that falls within TWO SD's, how would they properly ask that question?

2. When we speak of "Standard Deviation", does it only apply to a Normal Distribution curve, or can SD also apply to other distribution curves?

3. When someone asks to compute the SD of a data set, how would a person know which type of distribution curve to use for the calculation if the distribution curve is not known?

4. What is the history behind the SD? In other words, who invented it and why?

I own a business that repairs X-ray equipment. During testing, I record various measured parameters of the repaired equipment. After I record a number of values over time, I calculate the SD to determine if my measuring methods are precise or "sloppy". If I had a better understanding of SD, I might decide that I'm wasting my time calculating it if the numbers have little meaning.

I hope this makes sense! Thanks!

Date: 01/15/2004 at 13:29:55
From: Doctor Douglas
Subject: Re: Understanding Standard Deviation

Hi Mike.

Thanks for writing to the Math Forum. Your question is refreshing, because it seeks understanding, even when "calculating the math is easy". Very few of our submitted questions try to do this.

1. Yes, "computing THE standard deviation" means "computing one standard deviation". Implicitly, this means that when you are done, you also know the value of 2 standard deviations (e.g. for 95.5% confidence interval problems), or 3 SD's, or 2.5 SD's, and so on.

2. You can compute a standard deviation for almost any distribution as the square root of the variance, where the variance is defined in terms of an integral (for continuous distributions such as the normal or Gaussian distribution, and the uniform distribution), or in terms of a sum (for discrete distributions such as the binomial and Poisson distributions). I say "almost" because there are some distributions for which these integrals or sums do not converge, even though the distributions do describe probabilities in certain situations.

3. For a data set {x1,x2,...,xN}, the distribution is (usually) unknown, but you can still compute its standard deviation:

     SD = sqrt{[(x1-u)^2 + (x2-u)^2 + ... + (xN-u)^2]/N}

where N is the number of sample values and u is their mean. This computation does not require a priori knowledge of the distribution. In fact, it may be used to INFER the shape of the distribution, although additional information is needed.

For example, suppose the values {x1,...,xN} measure the angular position of the sun in degrees (and for argument's sake let us imagine that the sun moves in a circular orbit at constant speed around the earth, and that it passes directly overhead at noon). You can imagine two situations:

A. Measurements are taken at "noon" by many different people, each of whom defines "noon" according to his or her own wristwatch. The sample values {x} will probably be *normally* distributed around a mean of zero degrees (i.e. overhead), and the standard deviation is an estimate of the typical amount of time by which the wristwatches are unsynchronized (expressed as the angle the sun moves in that time). If you construct a histogram of the measured angles, it should approximate the bell-shaped curve of a normal distribution.

B. Measurements are taken anytime the sun is up. Now, the sample values will probably be *uniformly* distributed between -90 degrees (sunrise) and +90 degrees (sunset), with no particular time being favored over another, even though zero degrees is certainly still the mean of this distribution. In this case the standard deviation is related to the (angular) length of the day, and has very little to do with wristwatches. In fact, the standard deviation extracted from the data set will likely have a value near (180 degrees)/sqrt(12), or about 52 degrees. The histogram of the sample angles in this situation will be flat from -90 to +90 degrees.

In both of these cases, one can mechanically compute the standard deviation of the data. Interpreting what this value means, though, is more delicate. This is relevant to your equipment repair: your measurements may indicate, for example, that your measuring methods are too coarse or just right, or perhaps indicate which particular components of the equipment are problematic.

4. As for the history of the standard deviation and its usage, the following webpage

   Earliest Known Usage of Some of the Words of Mathematics (S)
   http://jeff560.tripod.com/s.html

says that the term was "introduced by Karl Pearson (1857-1936) in 1893 'although the idea was by then nearly a century old'". A bibliography on this topic is at

   History of Mathematics: History of Probability and Statistics
   http://aleph0.clarku.edu/~djoyce/mathhist/statistics.html

I hope this helps answer your questions. Feel free to write back if you need more information.

- Doctor Douglas, The Math Forum
  http://mathforum.org/dr.math/

Date: 01/15/2004 at 13:37:48
From: Mike Gray
Subject: Thank you (Understanding Standard Deviation)

Thank you very much for your quick and enlightening answers to my questions! I appreciate it very much. You pick up where the textbooks leave off!

My best regards,
Mike
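
The formula in answer 3 and the two sun-angle situations (A and B) can be tried out numerically. What follows is only a minimal Python sketch under stated assumptions: the function names population_sd and fraction_within, the sample size, the random seed, and the 2-degree spread chosen for the wristwatch case are all illustrative choices and do not come from the exchange above.

```python
import math
import random
import statistics


def population_sd(values):
    """Standard deviation using the divide-by-N formula from answer 3."""
    n = len(values)
    u = sum(values) / n  # the mean, written u in the formula
    return math.sqrt(sum((x - u) ** 2 for x in values) / n)


def fraction_within(values, k):
    """Fraction of the values lying within k standard deviations of the mean."""
    u = sum(values) / len(values)
    sd = population_sd(values)
    return sum(1 for x in values if abs(x - u) <= k * sd) / len(values)


rng = random.Random(2004)  # fixed seed (arbitrary) so a run is repeatable
N = 100_000                # number of simulated measurements (arbitrary)

# Situation A (assumed model): unsynchronized wristwatches give angles that
# are normally distributed around 0 degrees. The 2-degree spread is a
# made-up illustrative value.
angles_a = [rng.gauss(0.0, 2.0) for _ in range(N)]

# Situation B: angles measured at any time of day, uniform on [-90, +90].
angles_b = [rng.uniform(-90.0, 90.0) for _ in range(N)]

sd_a = population_sd(angles_a)
sd_b = population_sd(angles_b)

# statistics.pstdev also divides by N, so it should agree with the formula.
print("A: SD =", round(sd_a, 3), " pstdev =", round(statistics.pstdev(angles_a), 3))
print("B: SD =", round(sd_b, 2), " 180/sqrt(12) =", round(180 / math.sqrt(12), 2))

# The share of data within 2 SD depends on the shape of the distribution.
print("A: fraction within 2 SD =", round(fraction_within(angles_a, 2), 3))
print("B: fraction within 2 SD =", round(fraction_within(angles_b, 2), 3))
```

Under these assumptions, situation B should give an SD close to 52 degrees, and every uniform sample falls within 2 SD of the mean, whereas only about 95% of the normal samples do. The same mechanical computation therefore calls for a different interpretation depending on the shape of the distribution, which is the point made in answer 3.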
© 1994- The Math Forum at NCTM. All rights reserved.
http://mathforum.org/dr.math/