Standard Deviation for Variance?
Date: 01/25/2001 at 06:00:15
From: Dr Peter Lobmayer
Subject: Standard deviation for variance - does it exist ?

I calculate income inequality from a survey sample. One of the measures I use is the variance of the logarithm of individual income in different geographical areas. Sample size varies from 100 to 1600 in different areas.

I would like to calculate a measure of the reliability of my data. The best would be the standard deviation of the variance, but I could not find such a term in my reference books. Does such a measure exist? If so, how is it calculated?

With best regards,
Peter Lobmayer.
Date: 01/25/2001 at 06:35:47
From: Doctor Mitteldorf
Subject: Re: Standard deviation for variance - does it exist ?

Dear Peter,

Here's a somewhat personal view; you might get a different one from another statistician, and I encourage you to seek one out.

Don't think in terms of formulas and doing the one right thing with your data. There are lots of formulas, but there are no hard-and-fast rules telling you the right one to use in a given circumstance. The art of the statistician is to create a mathematical model that (1) applies to the question at hand, and (2) answers the exact question to which you're seeking a solution. Formulating that question precisely is the crux of your art.

When you find yourself asking for the "best" measure, or even asking "does this measure exist?" you're straying from the notion of mathematical modeling, and seeking to justify your work via some "higher authority." But there is no higher authority. Every statistical problem is unique, and you must stand on the cogency of your own reasoning every time you present a statistical argument in a scientific journal.

So much for the sermon. What's to be done in your situation? My bias here leads me to the practical rather than the theoretical. I offer a prescription that is transparently fair and relevant, but which is not a textbook formula:

For each of your samples of size n, randomly delete sqrt(n) data points. (I suggest sqrt(n) because any sample of size n is associated with a statistical fluctuation on the scale sqrt(n).) Now recalculate the variance of the log of incomes as you did before.

Repeat this entire process 10,000 times, each time ignoring a different random subset of sqrt(n) data points for each of the areas in your sample. Record all 10,000 answers, and calculate their mean and standard deviation. The mean should be very close to your original calculation; the standard deviation is a very fair measure of the reliability of your final answer.
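For readers who want to try the prescription above, here is a minimal Python sketch for a single area. It is an illustration under stated assumptions, not part of the original answer: the function names (sample_variance, subsample_spread) and the log-normal test data are hypothetical, every income is assumed to be positive, sqrt(n) is rounded to the nearest whole number, and the demo uses 2,000 repetitions rather than 10,000 to keep the run short.

```python
import math
import random


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def subsample_spread(incomes, n_reps=10_000, seed=0):
    """Repeatedly drop about sqrt(n) random points, recompute the variance
    of log income, and return (mean, standard deviation) of the results."""
    rng = random.Random(seed)
    logs = [math.log(x) for x in incomes]  # assumes every income is positive
    n = len(logs)
    d = round(math.sqrt(n))  # number of points deleted in each repetition
    results = []
    for _ in range(n_reps):
        kept = rng.sample(logs, n - d)  # random subset with d points left out
        results.append(sample_variance(kept))
    mean = sum(results) / n_reps
    sd = math.sqrt(sample_variance(results))
    return mean, sd


# Hypothetical data: 400 log-normally distributed incomes (not survey data).
data_rng = random.Random(42)
incomes = [data_rng.lognormvariate(10.0, 0.8) for _ in range(400)]

full = sample_variance([math.log(x) for x in incomes])
mean, sd = subsample_spread(incomes, n_reps=2000)
print(f"full-sample variance of log income: {full:.4f}")
print(f"mean over subsamples: {mean:.4f}, spread: {sd:.4f}")
```

Running the same function once per geographical area gives one spread figure per area, which can be compared across areas of different sample size. One caution: because each repetition keeps most of the data, the raw spread tends to be smaller than a conventional standard error. A related textbook method, the delete-d jackknife, multiplies the variance of the recomputed values by (n - d)/d to correct for this; that rescaling is not part of the prescription above.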
This kind of thinking is called "Monte Carlo simulation" and was invented around the time of the first computer. It uses a lot of computer power, but computer power is free for most of us these days. It requires some programming, whereas many statistical software packages don't. I like Monte Carlo simulation because it can apply exactly and specifically to your data and your situation in a way that a textbook statistical test rarely can.

- Doctor Mitteldorf, The Math Forum
  http://mathforum.org/dr.math/
© 1994- The Math Forum at NCTM. All rights reserved.