Math Forum » Discussions » Software » comp.soft-sys.matlab

Topic: Determine relative importance of original variables after performing PCA
Replies: 6   Last Post: Jan 28, 2013 6:48 AM

Greg Heath

Re: Determine relative importance of original variables after performing PCA
Posted: Jan 25, 2013 3:05 PM

"Maureen " <maureen_510@hotmail.com> wrote in message <kdrm1o$e57$1@newscl01ah.mathworks.com>...
> I have 350 observations and 27 variables. I want to use PCA for dimension reduction so that I can plot the 350 observations on a 2D plot, which effectively means that I will only be using PC1 and PC2. My purpose is just to see their relationship on a 2D plot.
>
> But how do I determine which of my original variables contribute most to the first two principal components, and which are less important and can be discarded? I have seen many similar posts online but have not come up with a solution. Where should I go from here?


You have not indicated

1. whether the task is classification or regression
2. if any of the 27 are outputs
3. the number of output variables

The most important is 1 because PCA is inappropriate for classification. Therefore, I'll
assume the task is regression.

The next most important is 2 because PCA is only used to transform the input space.
Therefore I'll assume 27 original input variables.

3 is still important because it affects which algorithms/techniques should be used.

> I have read through the documentation on feature selection, and some people suggested using stepwisefit and other regression methods.

Yes. The best criterion to use is one that optimizes a specific function of the output variables.

> I do not have much background in regression, so do correct me if I am wrong. Based on my reading, I believe I would need a set of criteria to select the features, but I have no idea what the criteria should be.

If it's regression, it is simple: just read the STEPWISEFIT documentation.
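
To illustrate the idea behind stepwise selection, here is a rough sketch in Python with NumPy (an assumption; the thread itself is about MATLAB). It is not STEPWISEFIT: it only adds variables greedily by reduction in residual sum of squares, whereas STEPWISEFIT adds and removes terms based on p-value thresholds. The function name `forward_select`, the `min_rel_gain` stopping rule, and the synthetic data are all hypothetical.

```python
import numpy as np

def forward_select(X, y, max_features=5, min_rel_gain=0.05):
    """Greedy forward selection: keep adding the column that most reduces the RSS."""
    n, p = X.shape
    selected = []
    # Baseline: intercept-only model.
    rss = float(np.sum((y - y.mean()) ** 2))
    while len(selected) < max_features:
        best_j, best_rss = None, rss
        for j in range(p):
            if j in selected:
                continue
            # Least-squares fit with an intercept plus the candidate subset.
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            r = float(np.sum((y - A @ coef) ** 2))
            if r < best_rss:
                best_j, best_rss = j, r
        # Stop when the best candidate no longer improves the fit enough.
        if best_j is None or (rss - best_rss) < min_rel_gain * rss:
            break
        selected.append(best_j)
        rss = best_rss
    return selected

# Synthetic data (hypothetical): 350 observations, 27 inputs,
# with an output that depends only on columns 3 and 10.
rng = np.random.default_rng(0)
X = rng.normal(size=(350, 27))
y = 2.0 * X[:, 3] - 1.5 * X[:, 10] + 0.1 * rng.normal(size=350)

chosen = forward_select(X, y)
print("selected columns:", chosen)
```

Note that this needs an output y. Without one, there is nothing for the selection criterion to optimize.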

If it is classification, then you should not be using PCA because there is no reason why
PCA space should be preferred over the original.
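
A small sketch of why this can happen, in Python with NumPy (an assumption; any matrix language would do). PCA orders directions by variance and never sees the class labels, so a high-variance variable with no class information can dominate PC1. The two-variable data set below is synthetic and constructed to show exactly that; the helper `separation` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
labels = rng.integers(0, 2, size=n)

# x0: large variance, unrelated to the class.
# x1: small variance, but its mean differs between the two classes.
x0 = rng.normal(scale=10.0, size=n)
x1 = np.where(labels == 1, 1.0, -1.0) + rng.normal(scale=0.3, size=n)
X = np.column_stack([x0, x1])

# PCA via SVD of the centered data; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]
print("PC1 direction:", pc1)

def separation(v):
    """Gap between the class means, in units of the overall standard deviation."""
    return abs(v[labels == 1].mean() - v[labels == 0].mean()) / v.std()

sep_pc1 = separation(Xc @ pc1)    # projection onto PC1
sep_x1 = separation(Xc[:, 1])     # the original low-variance variable
print("separation along PC1:", sep_pc1)
print("separation along x1: ", sep_x1)
```

Here PC1 lines up almost entirely with x0, so keeping only the leading component would discard most of the class information carried by x1.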

> Also, there should be a set of outputs, Y, in order to perform stepwisefit. But in my case, all 27 variables are my features, the inputs so to speak, and I do not have a set of outputs.
>
> So if not using regression, may I know where do I go from here, so that I can determine the importance of my original set of variables? In other words, I need to find the contribution of the original variables to PC1 and PC2.
>
> Appreciate any help/ suggestion. Thanks in advance!


If you don't know what you want to optimize, then there is no reason to use PCA
over the original variables.
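
For reference, the quantity the question asks about (how much each original variable contributes to PC1 and PC2) is usually read from the PCA coefficients, also called loadings. Below is a minimal sketch in Python with NumPy (an assumption; in MATLAB the analogous quantity is the coefficient matrix returned by `princomp` or `pca`). The data are synthetic stand-ins for the 350 x 27 matrix, and the result only describes the variance structure of the inputs. It says nothing about importance for any particular goal.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-in for the 350 x 27 data matrix.
X = rng.normal(size=(350, 27))
# Let variables 0, 1 and 2 share a common factor so they are strongly correlated.
common = rng.normal(size=(350, 1))
X[:, :3] += 3.0 * common

# Standardize so that variables on different scales are comparable.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA via SVD: rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
coeff = Vt[:2].T                    # 27 x 2: coefficients of each variable in PC1, PC2
explained = s**2 / np.sum(s**2)     # fraction of total variance per component
scores = Z @ coeff                  # 350 x 2: coordinates for the 2D plot

# Each column of coeff has unit length, so the squared entries sum to 1
# and can be read as each variable's share of that component.
share = coeff**2
top_pc1 = np.argsort(share[:, 0])[::-1][:3]
print("variables with the largest share of PC1:", top_pc1)
print("variance explained by PC1, PC2:", explained[:2])
```

A variable with a small share in both columns contributes little to the 2D plot, but that alone does not make it safe to discard for a downstream task.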

What do you want to do with the data? What is your ultimate goal?

Greg

P.S. I want to do carpentry with two tools. Which 2 should I use?


