
Topic: Mahalanobis_distance and Gaussian distribution
Replies: 6   Last Post: Jan 18, 2013 12:10 PM

David Jones

Re: Mahalanobis_distance and Gaussian distribution
Posted: Jan 13, 2013 3:03 PM



"MBALOVER" wrote in message
news:5f647925-32c3-4eea-81d0-373dce065232@googlegroups.com...

>from Wiki, http://en.wikipedia.org/wiki/Mahalanobis_distance
>Maha distance is to measure the probability if a point belongs to a
>distribution.



>1.Do we have to assume that that distribution is Gaussian to have Maha
>distance meaningful?


Some results for the distribution of the statistic do rely on the assumption
that the initial distribution is Gaussian. Assuming that you mean the case
where the mean and covariance matrix are assumed known, there is the result
that the statistic (the squared distance) has a chi-squared distribution,
with degrees of freedom equal to the number of dimensions ... which does
rely on the Gaussian assumption. But there is a related result that does not
rely on the Gaussian assumption ... specifically, the mean value of the
statistic is known ... but this does rely on having used the right
covariance matrix. The variance of the statistic requires rather more
information, but can be evaluated theoretically from the first four joint
moments of the initial distribution. This is probably too complicated for
practical use. There are several possibilities.
(i) use probabilities derived from the chi-squared result, but don't treat
them as anything more than a rough guide;
(ii) create a standardised statistic by subtracting off the mean and
dividing by the standard deviation, both derived from the chi-squared
result ... this at least would give something more easily related to the
underlying data, and would not depend so heavily on the Gaussian
distribution to derive fictitious/incorrect probabilities;
(iii) use some sort of resampling technique or simulation to get a better
grip on the properties of the distribution.
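
The three possibilities can be sketched as follows. This is only an illustration under stated assumptions: the non-Gaussian distribution (centred exponential components mixed by a matrix `L`), the mean, the point being tested, and the helper names `mahalanobis_sq` and `sample` are all made up for the example. The closed-form tail `exp(-d2 / 2)` is the chi-squared exceedance probability for 2 dimensions only.

```python
import math
import numpy as np

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of each row of x from `mean` (known mean and covariance)."""
    diff = np.atleast_2d(x) - mean
    # Solve cov @ z = diff^T instead of forming the inverse explicitly.
    z = np.linalg.solve(cov, diff.T)
    return np.einsum("ij,ji->i", diff, z)

rng = np.random.default_rng(0)
p = 2                                   # number of dimensions
mean = np.array([1.0, -2.0])            # assumed known mean
L = np.array([[1.0, 0.0], [0.5, 2.0]])  # hypothetical mixing matrix
cov = L @ L.T                           # known covariance of the sampled distribution

def sample(n):
    """Draw n points from a skewed (non-Gaussian) distribution with this mean and covariance."""
    e = rng.exponential(1.0, size=(n, p)) - 1.0   # zero mean, unit variance, skewed
    return mean + e @ L.T

point = np.array([3.0, 1.0])
d2 = mahalanobis_sq(point, mean, cov)[0]

# (i) chi-squared tail probability (p = 2 closed form): a rough guide only.
chi2_tail = math.exp(-d2 / 2.0)

# (ii) standardise with the chi-squared mean p and standard deviation sqrt(2p).
z_score = (d2 - p) / math.sqrt(2.0 * p)

# (iii) simulate from the actual distribution and count exceedances.
sim_d2 = mahalanobis_sq(sample(200_000), mean, cov)
sim_tail = float(np.mean(sim_d2 >= d2))

print(f"D^2 = {d2:.3f}, chi2 tail = {chi2_tail:.4f}, "
      f"z = {z_score:.2f}, simulated tail = {sim_tail:.4f}")
print(f"simulated mean of D^2 = {sim_d2.mean():.3f} (theory: {p})")
```

The simulated mean of the squared distance should come out close to the number of dimensions even though the data are not Gaussian, whereas the simulated tail probability need not agree with the chi-squared one.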

Of course, if you are using a version of the statistic where the mean and
variance of the initial distribution have to be estimated from data, the
situation is more complicated.
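
A small simulation gives a feel for the effect of estimation. It is a sketch under assumptions chosen for illustration only: the data are Gaussian, the sample size `n = 10` is deliberately small, and the new point is independent of the sample used for the estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, reps = 2, 10, 20_000   # dimensions, sample size, repetitions (all hypothetical)

d2_known = np.empty(reps)    # distance using the true mean (0) and covariance (identity)
d2_est = np.empty(reps)      # distance using the sample mean and sample covariance

for r in range(reps):
    train = rng.standard_normal((n, p))   # sample used to estimate the moments
    new = rng.standard_normal(p)          # independent point to be judged
    d2_known[r] = new @ new
    xbar = train.mean(axis=0)
    s = np.cov(train, rowvar=False)       # unbiased sample covariance
    diff = new - xbar
    d2_est[r] = diff @ np.linalg.solve(s, diff)

print(f"mean D^2, known moments:     {d2_known.mean():.2f}")
print(f"mean D^2, estimated moments: {d2_est.mean():.2f}")
```

With estimated moments the squared distance is noticeably larger on average than the number of dimensions, so probabilities read off the chi-squared distribution would overstate how unusual a point is.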

The Mahalanobis distance is a multivariate version of judging the distance
of a point from a univariate distribution by scaling the distance from the
mean by the population standard deviation, and is applicable to any
distribution for which these moments exist. Of course, there may always be
something better ... this would depend both on the distribution being used
as the initial distribution and on the sort of departures from this
distribution that are important in any particular context. There are several
difficulties in defining general measures of how far a point is from a
distribution, not least because of the potential effects of even simple 1-1
transformations of a multivariate space (and the Mahalanobis distance isn't
immune from these difficulties, but at least one can look for a
transformation yielding something close to a multivariate Gaussian
distribution).
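
For reference, the quantities discussed above can be written out as follows. The notation (x, \mu, \Sigma, p) is chosen here and is not taken from the thread; the mean result needs only that the mean \mu and covariance \Sigma exist and are the right ones.

```latex
% Squared Mahalanobis distance of a point x from a p-dimensional distribution
D^2(x) = (x-\mu)^{\top}\,\Sigma^{-1}\,(x-\mu), \qquad x,\ \mu \in \mathbb{R}^p .

% Univariate case (p = 1): the squared standardised distance from the mean
D^2(x) = \left(\frac{x-\mu}{\sigma}\right)^{2} .

% Mean of the statistic, with no Gaussian assumption
\mathrm{E}\,[D^2(X)]
  = \operatorname{tr}\!\left(\Sigma^{-1}\,\mathrm{E}\,[(X-\mu)(X-\mu)^{\top}]\right)
  = \operatorname{tr}(I_p) = p .

% Gaussian case only: D^2(X) \sim \chi^2_p, so that
\operatorname{Var}\,[D^2(X)] = 2p .
```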


>2. I have two distributions in different coordinate spaces. Let's say Space A,
>which has 3D, and Space B, which has 2D. I have two points: P1 with
>coordinates [x, y, z] in Space A and P2 with coordinates [u, v] in Space
>B. I wonder if I can apply MH distance to compare which one (either P1 or
>P2) is closer to its corresponding distribution. Does the comparison make
>sense?
>Do I have to do anything to normalize between the two distributions?


Would you be prepared to compare the simple shifted and scaled (i.e. using
mean and standard deviation) versions of measurements from two different
univariate distributions? The answer isn't obviously yes. If you look at the
properties of the Mahalanobis distance you will see that these do depend on
the number of dimensions involved in the initial distributions, but you
might choose to proceed either by converting to chi-squared exceedance
probabilities (or equivalent), or by standardising by the mean and standard
deviation of the Mahalanobis distance.
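
Both routes can be sketched as follows for a 3D and a 2D space. All numbers (means, covariance matrices and the points P1 and P2) are invented for illustration, and `chi2_sf` is a hypothetical helper that uses the closed-form chi-squared tail for integer degrees of freedom. The resulting probabilities carry the Gaussian caveat discussed earlier.

```python
import math
import numpy as np

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a single point from (mean, cov)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(cov, diff))

def chi2_sf(x, k):
    """Exceedance probability P(chi-squared with integer k d.o.f. > x)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        # Even k: finite Poisson-type sum.
        return math.exp(-h) * sum(h**j / math.factorial(j) for j in range(k // 2))
    # Odd k: complementary error function plus a finite sum.
    extra = sum(h**(j - 0.5) / math.gamma(j + 0.5) for j in range(1, (k - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * extra

# Space A (3D) and Space B (2D): hypothetical known moments and points.
mean_a = np.zeros(3)
cov_a = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 4.0]])
p1 = np.array([1.0, 1.0, 2.0])

mean_b = np.zeros(2)
cov_b = np.array([[1.0, 0.3], [0.3, 1.0]])
p2 = np.array([1.5, -1.0])

d2_a = mahalanobis_sq(p1, mean_a, cov_a)
d2_b = mahalanobis_sq(p2, mean_b, cov_b)

# Route 1: chi-squared exceedance probabilities (smaller = further out).
tail_a = chi2_sf(d2_a, 3)
tail_b = chi2_sf(d2_b, 2)

# Route 2: standardise by the chi-squared mean k and standard deviation sqrt(2k).
z_a = (d2_a - 3) / math.sqrt(2 * 3)
z_b = (d2_b - 2) / math.sqrt(2 * 2)

print(f"P1: D^2 = {d2_a:.3f}, tail = {tail_a:.3f}, z = {z_a:.2f}")
print(f"P2: D^2 = {d2_b:.3f}, tail = {tail_b:.3f}, z = {z_b:.2f}")
```

Note that the same raw squared distance corresponds to different exceedance probabilities in 2 and 3 dimensions, which is why the raw distances should not be compared directly.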



