

gmdistribution.fit oddity
Posted: Apr 8, 2013 8:39 PM


I am using Statistics Toolbox 8.1 with MATLAB R2012b.
I recall using gmdistribution.fit on different computers in the past with no problems, but on the current computer I am seeing some odd behavior:
1. gmdistribution.fit seems to always overfit according to both AIC and BIC. I could see AIC tending to overfit, but BIC? For example, I take some ridiculously downsampled data so that there are only 10 samples and, simply by "eyeballing it", you could never tell it came from more than a 2-component mixture at best. Yet gmdistribution.fit spits back an 8-component GMM as the minimum-BIC solution!? I swear I didn't have this problem in the past; it would spit back a 1- or 2-component fit as the minimum-BIC solution (i.e., fit multiple models and select the one with the minimum BIC).
2. Regardless of whether I specify CovType as 'full', or allow it to supposedly default to this, all the GMM components it fits are diagonal. Clearly my data should not have a best fit that is diagonal.
I can reproduce both problems with the very simple gmdistribution.fit example in the docs: downsample "X" to 5 samples and fit 1-, 2-, and 3-component GMMs. It always selects the highest order, even when that is overfitting, and obj.Sigma is always diagonal. This happens whether or not I set CovType to 'full' and no matter which 5 samples I keep, regardless of how strongly the "eyeball" test suggests that the covariance matrices should not be diagonal....
...any idea what could be going on here???
Again, I swear I've used gmdistribution.fit in the past and have NOT observed any of this, but that was on a different computer with R2010b and Statistics Toolbox 7.4.
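
The doc-example test described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the exact code that was run: the MU/SIGMA values are what I recall from the gmdistribution.fit documentation example, and the seed, the 'Replicates' value, and the try/catch around each fit are illustrative additions (with only 5 samples a fit may well error out instead of returning a model).

```matlab
% Sketch: fit 1-, 2-, and 3-component GMMs to 5 downsampled points
% and compare AIC/BIC. Requires the Statistics Toolbox.
rng(1);                                    % arbitrary seed (assumption), for repeatability

% Data as recalled from the gmdistribution.fit doc example (assumption).
MU1 = [1 2];    SIGMA1 = [2 0; 0 .5];
MU2 = [-3 -5];  SIGMA2 = [1 0; 0 1];
X = [mvnrnd(MU1, SIGMA1, 1000); mvnrnd(MU2, SIGMA2, 1000)];

% Downsample to 5 rows chosen at random.
idx  = randperm(size(X, 1), 5);
Xsub = X(idx, :);

maxK = 3;
aic  = nan(1, maxK);
bic  = nan(1, maxK);
for k = 1:maxK
    try
        % Ask for full covariances explicitly; 'Replicates' value is illustrative.
        obj = gmdistribution.fit(Xsub, k, 'CovType', 'full', 'Replicates', 5);
        aic(k) = obj.AIC;
        bic(k) = obj.BIC;
        fprintf('k = %d: CovType = %s, Converged = %d\n', ...
                k, obj.CovType, obj.Converged);
        disp(obj.Sigma);                   % d-by-d-by-k covariance array
    catch err
        % Fits that fail leave NaN in aic/bic for this k.
        fprintf('k = %d failed: %s\n', k, err.message);
    end
end

[~, bestK] = min(bic);                     % min skips NaN entries
fprintf('Minimum-BIC model order: %d\n', bestK);
```

Printing obj.CovType and obj.Sigma for each k shows directly whether the returned model is reported as 'full' and whether the off-diagonal entries are exactly zero or just small.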


