


Re: Principal Component Analysis Alternatives for low sample to dimensions ratio
Posted: Apr 11, 2013 12:48 PM


"Krevin" wrote in message news:910037f766874714b75b4a02268f1b8e@googlegroups.com...
Anyone know of good alternative methods to PCA when you have too many dimensions compared to samples?
If I have 2000 variables and 300 samples, I cannot properly use PCA.
I'm looking for something that can minimize false positive separation of sample points without needing to reduce my number of variables.
Thanks, Krevin
=================================================================================
There are many possibilities. Two ends of the range are:

(1) A version of PCA that uses a fictitious covariance matrix, guessed from experience rather than estimated from the data. A variant would estimate part of the covariance matrix from the data and fill in the rest by assuming zero correlations or zero partial correlations.

(2) A version of cluster analysis in which distances in the variable space are defined by the relative importance of the variables on an intuitive scale. A variant would simply use a weighted sum of squares, with weights derived from the sample variances and adjusted for any perceived overlaps in meaning. The idea here is to start from a good view of the "importance" of the variables, with the sample statistics playing little real role in the clustering.
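As a rough illustration only, here is one way the two suggestions might be sketched in Python with NumPy. Everything named here is an assumption made for the example, not something from the thread: the function names, the grouping of variables into `blocks`, the `importance` vector, and the toy data (scaled down from the 300 x 2000 case to keep it quick). For (1), covariances are estimated only within small groups of variables and assumed zero between groups; for (2), the weights are taken as a guessed importance divided by the sample variance.

```python
import numpy as np

def block_diagonal_covariance(X, blocks):
    """Estimate covariances within each block of variables only;
    correlations between different blocks are ASSUMED to be zero."""
    p = X.shape[1]
    cov = np.zeros((p, p))
    for idx in blocks:
        idx = np.asarray(idx)
        cov[np.ix_(idx, idx)] = np.atleast_2d(np.cov(X[:, idx], rowvar=False))
    return cov

def pca_with_assumed_covariance(X, cov, n_components=2):
    """Project centred data onto the leading eigenvectors of a supplied
    (guessed or partly estimated) covariance matrix."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return Xc @ eigvecs[:, order], eigvals[order]

def weighted_sq_distances(X, weights):
    """Pairwise weighted sums of squared differences between samples."""
    Xw = X * np.sqrt(weights)
    sq = (Xw ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Xw @ Xw.T)
    return np.maximum(d2, 0.0)                      # clip rounding noise

# Toy data: 30 samples, 200 variables (hypothetical, random).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 200))

# (1) Hypothetical grouping: 10 blocks of 20 variables each.
blocks = [range(start, start + 20) for start in range(0, 200, 20)]
cov = block_diagonal_covariance(X, blocks)
scores, eigvals = pca_with_assumed_covariance(X, cov, n_components=2)

# (2) Hypothetical importance scale: first 20 variables count double.
importance = np.ones(200)
importance[:20] = 2.0
weights = importance / X.var(axis=0, ddof=1)
d2 = weighted_sq_distances(X, weights)
```

The matrix `d2` could then be passed to whatever clustering routine accepts precomputed distances. In a real application the blocks and the importance weights would come from subject knowledge rather than from the data, which is the point of both suggestions.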
David Jones



