Topic: SVD for PCA: The right most rotation matrix
Replies: 22   Last Post: Jan 4, 2013 4:19 PM

Gary

Re: SVD for PCA: The right most rotation matrix
Posted: Jan 4, 2013 4:19 PM

On Monday, 29 October 2012 02:28:37 UTC+2, Paul wrote:
> My apologies if this appears twice. The posting of this message seems
> to have been held up.
>
> I am trying to understand SVD in the context of PCA. I have looked at
> Leskovec (http://search.yahoo.com/r/_ylt=A0oG7t31r41QSHsAFA9XNyoA;_ylu=X3oDMTE0YmlrMDI5BHNlYwNzcgRwb3MDMQRjb2xvA2FjMgR2dGlkA01BUDAwNl83MQ--/SIG=13fl10gvd/EXP=1351491701/**http%3a//www.cs.cmu.edu/~guestrin/Class/10701-S06/Handouts/recitations/recitation-pca_svd.ppt)
> and Shlens (http://search.yahoo.com/r/_ylt=A0oG7t0dsI1Qj3oAG1ZXNyoA;_ylu=X3oDMTE0YmlrMDI5BHNlYwNzcgRwb3MDMQRjb2xvA2FjMgR2dGlkA01BUDAwNl83MQ--/SIG=11r2sjgrs/EXP=1351491741/**http%3a//www.snl.salk.edu/~shlens/pca.pdf)
> for intuition.
>
> The scenario I use is a lab experiment in which m sensors
> synchronously sample data at n points in time, yielding a data matrix
> X with m rows and n columns. Each row contains the readings from a
> single sensor/instrument, and each column contains the readings from
> an instant in time. I suppose that the rows could also be key words
> in a data mining exercise, and the columns could be documents in which
> we try to find these key words (as per Leskovec above), but that
> scenario is a bit foggier for me because it deals with "concepts", the
> number of which matches neither m nor n. So as a first step, stick
> with the lab sensor/instrument scenario. Also, consider only real
> data, so the data covariance matrices are diagonalizable with
> orthonormal eigenvectors corresponding to simple rotations of the data
> in m-space.
>
> http://en.wikipedia.org/wiki/Principal_component_analysis#Details
> diagonalizes the data set X by factoring it into X=(W)(Sigma)(Vt)
> where:
>
> * For W, the columns of this (m)x(m) matrix are the orthonormal
> eigenvectors of covariance matrix (X)(Xt) {Xt is the transpose of X}.
>
> * Specifically, (X)(Xt) contains the covariances from pairing the m
> sensors/instruments rather than from pairing the n samples of m
> measurements. The former is of interest to us while for the life of
> me, I can't see the relevance of the latter.
>
> * Xt is the transpose of X.
>
> * Vt is the transpose of (n)x(n) matrix V. The columns of V are the
> orthonormal eigenvectors of the covariance matrix (Xt)(X) --
> specifically, the covariances from pairing the n samples of m
> measurements. The relevance of this matrix is what I can't see
> (intuitively).
>
> * Sigma is the diagonal matrix of square roots of the eigenvalues of
> (X)(Xt), whose nonzero eigenvalues are the same as those of (Xt)(X).
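
The factorization described in the list above can be checked numerically. The following is only a minimal sketch in Python with NumPy, not taken from any of the cited references: the sizes (3 sensors, 8 time samples) and the random data are made up for illustration, and each sensor row is mean-centered so that (X)(Xt) is proportional to a covariance matrix.

```python
import numpy as np

# Hypothetical data: m = 3 sensors, n = 8 time samples (assumed sizes).
rng = np.random.default_rng(0)
m, n = 3, 8
X = rng.normal(size=(m, n))
X = X - X.mean(axis=1, keepdims=True)  # center each sensor's row

# Full SVD: W is m x m, s holds the singular values, Vt is n x n.
W, s, Vt = np.linalg.svd(X, full_matrices=True)

# Sigma is m x n with the singular values on its diagonal (m <= n here).
Sigma = np.zeros((m, n))
Sigma[:m, :m] = np.diag(s)

# X = (W)(Sigma)(Vt)
reconstruction_ok = np.allclose(W @ Sigma @ Vt, X)

# Columns of W are eigenvectors of (X)(Xt) with eigenvalues s**2.
eigvec_ok = np.allclose((X @ X.T) @ W, W * s**2)

# (X)(Xt) and (Xt)(X) share their nonzero eigenvalues;
# the remaining n - m eigenvalues of (Xt)(X) are zero.
evals_left = np.sort(np.linalg.eigvalsh(X @ X.T))[::-1]
evals_right = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
left_ok = np.allclose(evals_left, s**2)
right_ok = (np.allclose(evals_right[:m], s**2)
            and np.allclose(evals_right[m:], 0.0, atol=1e-10))

print(reconstruction_ok, eigvec_ok, left_ok, right_ok)
```

Note that Sigma is rectangular (m rows, n columns) in the full factorization; only its leading m-by-m block is nonzero when m <= n.
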
>
> I am trying to eke out some intuition from X=(W)(Sigma)(Vt). I find
> it curious and interesting that the covariances (X)(Xt) are viewed as
> a linear transformation, and the eigenvectors in W become the
> orthogonal directions in which the scalings differ. Hence, they form
> the basis vectors that are aligned with the principal components.
> Then it becomes obvious that Sigma is simply the anisotropic axial
> scaling.
>
> If X is viewed as some kind of linear transformation (and I'm not sure
> if I'm actually supposed to do that), then Vt can be seen as a rotation
> so that the 1st principal component aligns with the 1st axis, the 2nd
> principal component aligns with the 2nd, etc., prior to the scaling by
> Sigma. Finally, I would expect W to rotate the data back to its
> original orientation, thus yielding X on the LHS.
>
> Following Shlens's tutorial, I find the above picture is easier to see
> if we rewrite the SVD formula as (Wt)(X)=(Sigma)(Vt), where the /rows/
> of Wt are the eigenvectors of covariance (X)(Xt) between
> sensors/instruments. Treating them as basis vectors, then multiplying
> them by the columns of X simply projects the m-value samples from each
> measurement instance onto the principal components, which yields the
> rotation of the data points so that the principal components align
> with the axes. Conversely, X=(W)[(Sigma)(Vt)] takes the data points
> in the rotated state (principal components aligned with axes) and
> unrotates them so that they match the orientation of the measured
> data points.
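
The rewritten form (Wt)(X)=(Sigma)(Vt) can be illustrated with a small experiment. This is a sketch under assumed conditions, not anything from Shlens's tutorial: two hypothetical sensors that both pick up one latent signal plus a little noise, with each row mean-centered. The names `latent` and `scores` are mine.

```python
import numpy as np

# Hypothetical data: 2 correlated sensors, 200 time samples.
rng = np.random.default_rng(1)
m, n = 2, 200
latent = rng.normal(size=n)
X = np.vstack([latent + 0.1 * rng.normal(size=n),
               0.5 * latent + 0.1 * rng.normal(size=n)])
X = X - X.mean(axis=1, keepdims=True)  # center each sensor's row

# Thin SVD: W is m x m, Vt is m x n.
W, s, Vt = np.linalg.svd(X, full_matrices=False)

# Each column of scores is one time sample expressed in the
# principal-component basis (the rows of Wt).
scores = W.T @ X
same_as_sigma_vt = np.allclose(scores, s[:, None] * Vt)

# In the new coordinates the rows are uncorrelated:
# (scores)(scores^T) is diagonal with entries s**2.
decorrelated = np.allclose(scores @ scores.T, np.diag(s**2))

# W is orthogonal, so every sample keeps its length ...
lengths_preserved = np.allclose(np.linalg.norm(scores, axis=0),
                                np.linalg.norm(X, axis=0))

# ... and multiplying by W undoes the change of basis.
back_to_x = np.allclose(W @ scores, X)

print(same_as_sigma_vt, decorrelated, lengths_preserved, back_to_x)
```

One caveat on the word "rotation": W is orthogonal, but its determinant may be -1, in which case the change of basis includes a reflection.
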
>
> One of the most disturbing things I haven't been able to figure out is
> what V (or Vt) corresponds to in the real world. I mean, if X were a
> transformation, then Vt would simply be a rotation in n-space. But X
> *isn't* a transformation. And n-space is meaningless because we would
> never treat the vector of data from a single sensor as a data point
> (i.e., each measurement instance in time as a dimension) and plot it
> in n-dimensional space. So even though V or Vt somehow corresponds to
> a geometric rotation of sorts, it's in a space that is nonsensical
> and has no bearing on the real world.
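
One possible reading of Vt follows directly from (Wt)(X)=(Sigma)(Vt): dividing row i of (Wt)(X) by the i-th singular value gives row i of Vt. So each of the first m rows of Vt is a length-n time series, namely the time course of the i-th principal component scaled to unit length. The sketch below assumes a made-up experiment (one sinusoidal signal seen by 3 sensors with different gains, plus small noise); `signal` and `gains` are hypothetical names used only for illustration.

```python
import numpy as np

# Hypothetical experiment: one common signal seen by 3 sensors.
rng = np.random.default_rng(2)
n = 100
t = np.linspace(0.0, 1.0, n)
signal = np.sin(2 * np.pi * 3 * t)
gains = np.array([1.0, -0.5, 2.0])
X = np.outer(gains, signal) + 0.01 * rng.normal(size=(3, n))
X = X - X.mean(axis=1, keepdims=True)  # center each sensor's row

W, s, Vt = np.linalg.svd(X, full_matrices=False)

# Row i of Vt = (row i of Wt X) / s[i]: a unit-length time course.
time_courses = (W.T @ X) / s[:, None]
rows_match = np.allclose(time_courses, Vt)

# The first time course follows the common signal (up to sign).
corr = np.corrcoef(Vt[0], signal - signal.mean())[0, 1]
tracks_signal = abs(corr) > 0.99

# X is a sum of (sensor pattern) x (time course) terms,
# each weighted by its singular value.
rebuilt = sum(s[i] * np.outer(W[:, i], Vt[i]) for i in range(len(s)))
sum_ok = np.allclose(rebuilt, X)

print(rows_match, tracks_signal, sum_ok)
```

Seen this way, a column of W says how strongly each sensor takes part in a component, and the matching row of Vt says how that component evolves over the n sampling instants.
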
>
> I realize that Leskovec describes SVD differently, as documents versus
> search terms, with concepts as an intermediate thing that is
> determined by the SVD. The left and right singular vectors then
> represent the correlation of documents versus concepts and search
> terms versus concepts. However, he doesn't really delve into why the
> math corresponds to that. Also, I'm much more interested in the lab
> sensor/instrument scenario, where the size of the diagonal matrix
> corresponds to the size of the data set (at least before dimensional
> reduction).
>
> So when I look at the mockingly simple SVD formula, I have developed a
> phobia of the mysterious rotation matrix at the tail end. It has
> defied my endless attempts (no joke) to understand it
> intuitively. Thank you to anyone who can impart some clear intuition
> about this.


You have been given a lot of references but I didn't see the one below so I will mention it here:

Stanley Mulaik, "Foundations of Factor Analysis" (second edition). Chapman & Hall/CRC (Taylor & Francis Group). Publication date: 2010.

Lance


