```Date: Jan 17, 2013 2:02 PM
Author: Richard Ulrich
Subject: Re: Mahalanobis_distance and Gaussian distribution

On Thu, 17 Jan 2013 11:21:24 -0000, "David Jones" <dajhawk@hotmail.co.uk> wrote:

[snip, a bunch]
>
>The Mahalanobis distances may be dimensionless with respect to the units of
>the underlying observations but that does not mean that they are immediately
>comparable across different sources of data. Even if the number of
>dimensions is the same you still need to look at context. For example, if
>used in some formal testing procedure, the power of such tests can be
>different. Consider two different sets of observations on the underlying
>quantity, one with rather more random observation error than the other.
>
>For different dimensions, consider the case where the dimensions are much
>more different, say 2 and 100. Then a typical value of Mahalanobis distance
>for a point from the second population would be 100, but this would be a very
>unusual value for a point from the first population. In fact the sets of
>values of distances for the two populations would hardly overlap. If this is
>meaningful for whatever way you intend to use the distances then OK. But
>many uses are of the kind where you are looking for datapoints that are
>unusual with respect to an initial distribution ... the Mahalanobis distance
>is not (without some transformation) directly usable in a comparison between
>sets of data with different dimensions, as exemplified in the case above
>where a value of 100 is unusual for one population but not the other.

David, I'm asking myself -- to judge which is more of an outlier, why can't we consider the "p-value" from each of these two chi-squared distributions with different df's?

I'm not saying that this is a good idea -- I *suspect* that there is something shaky about it, or I might have heard of it being done before, and it doesn't seem familiar. Or is that just because the circumstances are too rare in my reading?

Wikipedia tells me that the M distance was first used in anthropology, for categorizing new skulls. They should have "missing" to account for, at times when some measurements aren't available, which would create the same circumstance. I wonder what they do.

-- 
Rich Ulrich
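
For what it's worth, here is a rough sketch of the "p-value" comparison asked about above. It assumes Gaussian data with known mean and covariance, so that the squared Mahalanobis distance is chi-squared with df equal to the number of dimensions, and it treats the quoted value of 100 as a squared distance. The function name is made up for this illustration, and the closed form it uses only works for even df (which covers the 2 and 100 of the example). This is not a claim that the comparison is statistically sound, only what the arithmetic would look like.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only when df is an even positive integer.
    (Hypothetical helper for illustration, not a library function.)
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even, positive df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # current term (x/2)^k / k!, starting at k = 0
    total = 1.0  # running sum of the terms
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Assumption: 100 is the squared Mahalanobis distance in both populations.
d2 = 100.0
p_2 = chi2_upper_tail_even_df(d2, 2)      # 2-dimensional population
p_100 = chi2_upper_tail_even_df(d2, 100)  # 100-dimensional population

print(f"df=2:   upper-tail p = {p_2:.3g}")
print(f"df=100: upper-tail p = {p_100:.3g}")
```

With df = 2 the tail probability is exactly exp(-50), which is vanishingly small, while with df = 100 it comes out a little under one half. So on the p-value scale the same distance is an extreme outlier in one population and entirely ordinary in the other, which matches David's example. Whether p-values computed under different df's are comparable for ranking outliers is exactly the open question.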
