Paul


Re: SVD for PCA: The right most rotation matrix
Posted:
Nov 1, 2012 3:53 PM


I might be missing some linear algebra theory here, but I looked up gettrans() and I'm not sure what is meant by a column rotation in that context.
Please see below for further comments.
On Nov 1, 1:21 pm, Gottfried Helms <he...@unikassel.de> wrote:
> Hi Paul,
>
> if I understood you correctly, you setup the SVD on your X-data such
> that (let Z denote Sigma and W and Vt the unique rotations which
> cause that Z becomes diagonal)
>
>    Z = W * X * Vt
>
> The X-data contain n samples along the columns taken with m sensors
> defining the rows (I assume, they are centered, and for the example
> below that they are also standardized)
>
> Then I understood your question that you ask, what relevance has Vt
> and especially in terms of a multidimensional euclidean model.
>
> If I understand you correctly so far, then the following might be
> helpful concerning Vt.
>
> The key is, that the n samples define m vectors in an n-dimensional
> euclidean space; simply each column of X can be seen as a spatial
> dimension. In that n-dimensional space there are m vectors, where
> the number m is smaller than n. Any rotation in that space
> repositions the vectors, but *not* the relation, or better: the
> angles, between them
I'm not sure why *any* rotation in n-space would not preserve angles. I thought that a rotation is by definition a unitary transformation (from a recent brush-up on linear algebra at Wikipedia, e.g. http://en.wikipedia.org/wiki/Orthogonal_matrix).
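The point about rotations and angles can be checked numerically. The following is a minimal sketch, not anything taken from MatMate: the two vectors and the rotation angles are made-up values, and `rotate_in_plane` is a hypothetical helper that applies one plane (Givens) rotation in 6-dimensional space.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine of the angle between u and v.
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def rotate_in_plane(v, i, j, theta):
    # Rotate v by angle theta in the plane spanned by axes i and j.
    c, s = math.cos(theta), math.sin(theta)
    w = list(v)
    w[i] = c * v[i] - s * v[j]
    w[j] = s * v[i] + c * v[j]
    return w

# Two hypothetical "sensor" vectors in n = 6 dimensional sample space.
a = [0.2, -0.2, 2.1, 0.1, -0.9, -0.8]
b = [0.1, -0.1, 0.9, -1.6, 0.8, 1.4]

# Apply the same sequence of plane rotations to both vectors.
# A product of plane rotations is still an orthogonal transformation.
ra, rb = a, b
for i, j, theta in [(0, 3, 0.7), (1, 4, -1.2), (2, 5, 2.5), (0, 5, 0.3)]:
    ra = rotate_in_plane(ra, i, j, theta)
    rb = rotate_in_plane(rb, i, j, theta)

cos_before = cosine(a, b)
cos_after = cosine(ra, rb)
print(cos_before, cos_after)  # the two values agree up to rounding
```

The coordinates of both vectors change, but the cosine between them does not, which is the sense in which a rotation repositions the vectors without changing the angles between them.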
> ...So we can rotate the vector model X
> (columnwise) first such, that
>    sensor 1 defines the x-axes,
>    sensor 2 and 1 define the xy-plane
>    sensor 3 to 1 define the xyz-space
> and so on.
I don't quite follow what you mean by "rotat[ing] the vector [model] X columnwise". If you interpret each column of X as a point (or vector) in n-space, we get what you describe (sensor 1 is the x-axis, sensor 2 is the y-axis, etc.). However, a rotation is not needed for this.
> In effect, that rotation provides a matrix X1 which is triangular
> with as many non-zero columns as the rank of the matrix is (and we
> assume for simplicityness, that it equals m)
I think I'm missing something fundamental...the data matrix is not triangular, though the (n)x(n) covariance matrix (Xt)(X) is symmetric.
> Then the matrix X1 can be rotated to the position of their principal
> components (we're talking already of the non-zero columns only),
> let's call this X2
I see that the data must be rotated so that the principal axes align with the axes of m-space (not n-space), and then the diagonal matrix Sigma performs the anisotropic axial stretching.
> That two rotations together form your matrix Vt. After that, X2 can
> be rotated by rotation of its rows to diagonal form - this is your
> rotation-matrix W, which rotates for the principal components with
> respect of the rows in X2 (and which is the same as the rotation
> with respect of the rows in X).
But W is not applied after Vt; Sigma is (the anisotropic axial scaling). After that, however, I see that W does rotate the axially scaled body of data points so that the maximally stretched axis becomes aligned with the principal component of the measured data, the 2nd most stretched axis aligns with the 2nd principal component, etc.
So the rotation by W is very intuitive to me, while the rotation by Vt is not. And as I described, it's all the more mysterious when you consider that X isn't actually a transformation that is applied to data - it *is* the data. For this reason, I find it difficult to see the decomposition of X as a series of transformations (rotate, stretch, rotate) despite the intuitive appeal of the (W)(Sigma) part (there is no intuition on my part concerning Vt).
This inability to picture (W)(Sigma)(Vt) as a transformation shows up particularly in my lack of intuition concerning Vt...it is the first of the 3 decomposed transformations that gets applied to any data point/vector that is subjected to the 3-step transformation. The question is "What is this data that gets subjected to this transformation? And in n-space, no less". It seems that the 3-step transformation and the data are the same!
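One possible way to read the role of Vt, offered as a sketch under the thread's own assumptions rather than as Gottfried's intended meaning: take X to be m x n, with W (m x m) and V (n x n) orthogonal and Sigma of size m x n. Then the decomposition can be rearranged so that V acts on the data instead of the data acting on something else:

```latex
X = W \Sigma V^{T}, \qquad W^{T} W = I_m, \quad V^{T} V = I_n
\;\Longrightarrow\;
X V = W \Sigma, \qquad X X^{T} = W \Sigma \Sigma^{T} W^{T}.
```

Under this reading, V rotates the m sensor vectors (the rows of X) within n-space so that only their first m coordinates are non-zero, and the matrix of inner products between sensors, X X^T, does not involve V at all.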
Furthermore, when I am seeking correlation between the m sensors, it confounds me to think about why one would picture the data points in n-space. As an analogy, if I am doing simple linear regression on a cloud of 1000 points in the xy-plane, I don't try to picture the data points in 1000-dimensional space.
> I've done this step-by-step with my matrix-calculator MatMate and
> show the matrices where we have only (m=)3 sensors and (n=)6
> samples.
>
> We generate a random dataset for 3 sensors, and 6 samples, centered
> and standardized normal distributed data in matrix X
> [22] set randomstart=41
> [23] X =zvaluezl(abwzl( randomn(3,6)))
> X :
>    0.2096  0.2276  2.0787  0.1263  0.9337  0.8340
>    0.0668  0.0848  0.8837  1.6367  0.7827  1.3842
>    0.4677  0.4803  0.8500  0.2395  0.9091  1.9860
>
> Each row defines the coordinates of one vector in the n=6
> dimensional space. Now we get the rotation-matrix t1, which
> rotates X to triangular form, preserving the angles (=cosines,
> correlations) between the vectors:
> [24] t1 = gettrans(X,"Drei")
> t1 :
>    0.0856  0.0449  0.3898  0.6802  0.4701  0.3937
>    0.0929  0.0538  0.1865  0.1958  0.3348  0.8963
>    0.8486  0.1986  0.1513  0.2615  0.3856  0.0206
>    0.0516  0.6916  0.5339  0.4630  0.1392  0.0151
>    0.3812  0.2498  0.5843  0.1452  0.6459  0.1125
>    0.3405  0.6441  0.4049  0.4418  0.2858  0.1685
Sorry, I tried to google gettrans, but wasn't able to find much beyond the fact that it is a column rotation. It's not clear to me what is meant by that. Consequently, I wasn't able to follow the rest of the example.
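For what it's worth, here is one plausible reading of "column rotation to triangular form", written as a minimal sketch. It is an assumption, not MatMate's actual gettrans(): the hypothetical function `column_rotate_to_triangular` builds an orthogonal n x n matrix T from plane rotations so that X1 = X * T is lower triangular. The entries of X are copied as printed in the quoted post.

```python
import math

def column_rotate_to_triangular(X):
    # Return (X1, T) with X1 = X * T lower triangular and T orthogonal.
    m, n = len(X), len(X[0])
    A = [row[:] for row in X]
    T = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for i in range(m):
        # Zero out A[i][j] for j > i by rotating column j into column i.
        for j in range(n - 1, i, -1):
            x, y = A[i][i], A[i][j]
            r = math.hypot(x, y)
            if r == 0.0:
                continue
            c, s = x / r, y / r
            # Apply the same rotation of columns i and j to A and to T.
            for M in (A, T):
                for row in M:
                    p, q = row[i], row[j]
                    row[i] = c * p + s * q
                    row[j] = -s * p + c * q
    return A, T

def gram(M):
    # Inner products between the rows of M (sensor-by-sensor).
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in M] for r1 in M]

# 3 sensors (rows) x 6 samples (columns), values as printed in the post.
X = [
    [0.2096, 0.2276, 2.0787, 0.1263, 0.9337, 0.8340],
    [0.0668, 0.0848, 0.8837, 1.6367, 0.7827, 1.3842],
    [0.4677, 0.4803, 0.8500, 0.2395, 0.9091, 1.9860],
]

X1, T = column_rotate_to_triangular(X)
for row in X1:
    print(["%8.4f" % v for v in row])
```

In this sketch, X1 has non-zero entries only in its first 3 columns, and gram(X1) equals gram(X) up to rounding, so the inner products (and hence the angles) between the three sensor vectors are unchanged.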
However, I appreciate the time that you took to compose the attempted explanation.

