
Re: How to combine the standard deviations of multiple data subsets
Posted: Oct 7, 1999 6:52 AM


Hi
On Tue, 5 Oct 1999, Henry wrote:
> On Mon, 04 Oct 1999 21:35:27 GMT, se16@btinternet.com (Henry) wrote:
> >On Mon, 4 Oct 1999 14:29:17 -0600, "Steve Schnick"
> ><sschnick@yahoo.com> wrote:
> >>If one has several subsets of a given data set, and the mean, count,
> >>and standard deviations for each of these subsets, how can one
> >>calculate the combined standard deviation of the data subsets? i.e.,
> >>if the subsets were lumped together into one set, how does one
> >>calculate this new standard deviation?
> He did and my reply was:
> "Suppose data is Mi (the means) Ci (the counts) and Vi (the variances)
> The overall count is C = sumof(Ci)
> The overall mean is clearly M = sumof(Mi.Ci) / C
> The overall variance is V = sumof(Ci.(Mi-M)^2) / C + sumof(Ci.Vi) / C
> or equivalently V = sumof(Ci.[Vi+Mi^2]) / C - M^2
> No guarantee on this, but I think it is right."
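For anyone who wants to check the quoted formulas numerically, here is a minimal Python sketch. It assumes each Vi is a population variance (divisor Ci), which is what the formulas appear to require. The function name combine_population and the sample data are made up for illustration.

```python
from statistics import fmean, pvariance

def combine_population(means, counts, variances):
    """Combine per-subset means, counts and population variances (divisor Ci)."""
    C = sum(counts)                                         # overall count
    M = sum(m * c for m, c in zip(means, counts)) / C       # overall mean
    between = sum(c * (m - M) ** 2 for m, c in zip(means, counts)) / C
    within = sum(c * v for c, v in zip(counts, variances)) / C
    return C, M, between + within                           # V = between + within

# Illustrative data only (not from the thread).
subsets = [[1.0, 2.0, 3.0], [4.0, 6.0], [5.0, 5.0, 7.0, 9.0]]
means = [fmean(s) for s in subsets]
counts = [len(s) for s in subsets]
variances = [pvariance(s) for s in subsets]

C, M, V = combine_population(means, counts, variances)

# The "equivalent" form from the quoted reply: sumof(Ci.[Vi+Mi^2]) / C - M^2
V_alt = sum(c * (v + m ** 2) for m, c, v in zip(means, counts, variances)) / C - M ** 2

# Direct computation on the lumped-together data, for comparison.
pooled = [x for s in subsets for x in s]
V_direct = pvariance(pooled)
```

The combined standard deviation is then V ** 0.5. On this toy data, V, V_alt and V_direct should agree up to floating-point rounding.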
I have no reason to think the above is incorrect, but I would have conceptualized this problem using an ANOVA approach.
V = SStotal/(N-1) = (SSwithin+SSbetween)/(N-1)
SSwithin = sumof(Ni.Vi)
SSbetween = sumof(Ni.(Mi-M)^2)
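A rough Python sketch of the ANOVA version follows, with one caveat stated as an assumption: SSwithin = sumof(Ni.Vi) is exact only if each Vi was computed with divisor Ni. If the subset variances are the usual sample variances (divisor Ni-1), the within term would instead be sumof((Ni-1).Vi). The ddof switch below is a hypothetical parameter, named after NumPy's convention, that covers both cases. The function name and data are illustrative only.

```python
from statistics import fmean, pvariance, variance

def combine_anova(means, counts, variances, ddof=0):
    """Overall sample variance (divisor N-1) from per-subset summaries.

    ddof=0: each Vi used divisor Ni, so SSwithin = sum(Ni * Vi).
    ddof=1: each Vi used divisor Ni-1, so SSwithin = sum((Ni-1) * Vi).
    """
    N = sum(counts)
    M = sum(n * m for n, m in zip(counts, means)) / N        # grand mean
    ss_within = sum((n - ddof) * v for n, v in zip(counts, variances))
    ss_between = sum(n * (m - M) ** 2 for n, m in zip(counts, means))
    return (ss_within + ss_between) / (N - 1)                # SStotal / (N-1)

# Illustrative data only (not from the thread).
groups = [[1.0, 2.0, 3.0], [4.0, 6.0], [5.0, 5.0, 7.0, 9.0]]
means = [fmean(g) for g in groups]
counts = [len(g) for g in groups]

v_from_pop = combine_anova(means, counts, [pvariance(g) for g in groups], ddof=0)
v_from_samp = combine_anova(means, counts, [variance(g) for g in groups], ddof=1)

# Sample variance of the lumped-together data, for comparison.
v_direct = variance([x for g in groups for x in g])
```

Both calls should reproduce v_direct up to rounding. Note that this V uses divisor N-1, so it differs from the divisor-C variance in the quoted reply by a factor of N/(N-1).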
Best wishes
Jim

============================================================================
James M. Clark                          (204) 786-9757
Department of Psychology                (204) 774-4134 Fax
University of Winnipeg                  4L05D
Winnipeg, Manitoba  R3B 2E9             clark@uwinnipeg.ca
CANADA                                  http://www.uwinnipeg.ca/~clark
============================================================================

