Paul
Posts:
208
Registered:
2/23/10
Re: SVD for PCA: The right most rotation matrix
Posted:
Nov 4, 2012 2:04 PM
Hi, Gottfried,
Thanks again for taking the time to write such a detailed response. I gave it a quick twice-over, but it needs more than that. I'll get back to it soon, but I have some preliminary comments.
On Nov 2, 2:02 am, Gottfried Helms <he...@uni-kassel.de> wrote:
> it seems I made my comment more complicated than the procedure is.
>
> On 01.11.2012 20:53, Paul wrote:
>
>> I might be missing some linear algebra theory here, but I looked up
>> gettrans() and I'm not sure what is meant by a column rotation in
>> that context.
>
> No, gettrans is just a function-call in my MatMate-script-language,
> which returns a rotation-matrix. For instance, by the command:
>
>    t1 = gettrans(X,"drei") // "drei" means "triangular"
>
> t1 becomes the rotation-matrix, which is required to rotate
> columnwise...
I am still not sure what is a columnwise rotation. Do you actually switch columns around, or is it more like a geometric rotation?
> ...X to triangular shape. After that we can do the following
> with t1:
>
>    Y = X * t1
>    // Y is a lower triangular matrix (with possibly empty columns
>    // to the right)
>    Z = Y * t1'
>    // Z equals now X, because t1*t1' = I (identity matrix)
>
> or, for doing rotation to principal components position:
>
>    t2 = gettrans(X,"pc") // "pc" means "principal components"
>
> and then
>
>    B = X * t2
>    // the columns of B are now orthogonal, are the principal
>    // components
>
> I've introduced that function "gettrans" additionally to the simple
> "rotate"-function to have the rotation-matrix available for later
> manipulation, or to be able to reverse a rotation later or to apply
> the same rotation to another matrix etc. It can also be made to work
> only on certain columns and using only certain rows for the
> criterion; this is then useful if one uses rotations which are
> implemented as iterative procedures like "pc" or "varimax" or
> similar.
>
>>> The key is, that the n samples define m vectors in an
>>> n-dimensional euclidean space; simply each column of X can be seen
>>> as a spatial dimension. In that n-dimensional space there are m
>>> vectors, where the number m is smaller than n. Any rotation in
>>> that space repositions the vectors, but *not* the relation, or
>>> better: the angles, between them
>>
>> I'm not sure why *any* rotation in n-space would not preserve
>> angles. I thought that a rotation is by definition a unitary
>> transformation (from a recent brush-up on linear algebra at
>> Wikipedia e.g.
>> http://en.wikipedia.org/wiki/Orthogonal_matrix).
>
> My remark may be obfuscating here. There is the concept of "oblique
> rotations" in factor analysis (as opposed to orthogonal rotations)
> which do not preserve the angles - and I had the impulse to exclude
> this case verbally...
> So this remark could just be deleted
>
>>> ...So we can rotate the vector model X
>>> (columnwise) first such, that
>>>    sensor 1 defines the x-axes,
>>>    sensor 2 and 1 define the x-y-plane
>>>    sensor 3 to 1 define the x-y-z-space
>>> and so on.
>
>> I don't quite follow what you mean by "rotat[ing] the vector
>> [model] X columnwise". If you interpret each column of X as a
>> point (or vector) in n-space, we get what you describe (sensor 1 is
>> the x-axis, sensor 2 is the y-axis, etc.). However, a rotation is
>> not needed for this.
>
> If we speak of the n-dimensional space, each column represent the
> coordinates on one axis. Then each row represents one vector
> (from the origin) to some point in this n-dimensional space: for
> each sensor there is one wire from the origin into the n-space,
> and the angles between that wires (more precisely: the cosines of
> that angles) are expressed by the correlation-coefficients. That
> view of statistical data may be somehow unusual - but it is coherent
> with the operations of rotations and the finding of principal
> components - and this is what your matrix Vt stands for.
>
>>> In effect, that rotation provides a matrix X1 which is triangular
>>> with as many nonzero-columns as the rank of the matrix is (and we
>>> assume for simplicityness, that it equals m)
>
>> I think I'm missing something fundamental...the data matrix is not
>> triangular, though the (n)x(n) covariance matrix (Xt)(X) is
>> symmetric.
>
> No, not the data matrix X. But after X is rotated to triangular
> position by t1 then
>    X1 = X * t1
> is lower triangular (with some empty columns due to the defective
> rank of X)
What is meant by rotating to triangular position? Do you mean geometric position, or that X somehow becomes a triangular matrix by rearranging its columns? What if there are not enough properly placed zeros for that to be possible?
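A minimal NumPy sketch of one way a "rotation to triangular position" can work. It assumes that gettrans(X,"drei") behaves like a full QR factorization of X'; that is a guess about MatMate, not something confirmed in this thread. The sizes m and n and the random X are made up for illustration. The point it illustrates is that the columns are not swapped: X is multiplied by an orthogonal matrix, which is a geometric rotation (possibly combined with a reflection) of the n-space axes, and this always produces a lower triangular result without needing any zeros in X beforehand.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6                       # hypothetical sizes: 3 sensors, 6 samples
X = rng.standard_normal((m, n))   # rows = sensor vectors, columns = axes of n-space

# Full QR of X': X.T = Q @ R, with Q (n x n) orthogonal and R (n x m) upper triangular.
Q, R = np.linalg.qr(X.T, mode="complete")

t1 = Q            # stands in for gettrans(X,"drei") (assumption)
X1 = X @ t1       # equals R.T: lower triangular, columns m..n-1 are zero

print(np.round(X1, 4))

# t1 is orthogonal, so the rotation can be undone ...
print(np.allclose(X1 @ t1.T, X))
# ... and all inner products (hence angles) between the sensor rows are unchanged.
print(np.allclose(X1 @ X1.T, X @ X.T))
```

Note that Q from a QR factorization is orthogonal but may have determinant -1, so strictly speaking it can include a reflection as well as a rotation.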
>>> Then the matrix X1 can be rotated to the position of their
>>> principal components (we're talking already of the nonzero columns
>>> only), let's call this X2
>>
>> I see that the data must be rotated so that the principal axes
>> align with the axes of m-space (not n-space), and then the diagonal
>> matrix Sigma performs the anisotropic axial stretching.
>
> No, again we rotate in the columns/the n-space. Just we apply the
> (costly because of iterations) rotation to orthogonality (which
> gives principal components) only to the first m axes in X1 (which is
> already triangular with only m significant columns)
>
>    X2 = X1 * t2
> or equivalently
>    X2 = X * t1 * t2 = X * (t1 * t2) = X * Vt
>
> After that X2 contains the coordinates of your sensor-measures
> after rotation in the n-space in such a way that in the first
> column the sum of squared coordinates is the maximum possible
> and in the m'th column the least possible and because
> X2 ' * X2 is diagonal we may say, that the columns are orthogonal
>
>>> That two rotations together form your matrix Vt. After that, X2
>>> can be rotated by rotation of its rows to diagonal form - this is
>>> your rotation-matrix W, which rotates for the principal components
>>> with respect of the rows in X2 (and which is the same as the
>>> rotation with respect of the rows in X).
>>
>> But W is not applied after Vt,
>
> ???
>
> If we have
>    W * X * Vt
> we can also write
>    W * (X * Vt)
> which is meant when I say that W is applied "after" the rotation by
> Vt in my example....
I got lost...the middle matrix should be Sigma, a diagonal matrix of singular values.
>> So the rotation by W is very intuitive to me, while the rotation by
>> Vt is not. And as I described, it's all the more mysterious when
>> you consider that X isn't actually a transformation that is applied
>> to data -- it *is* the data.
>
> This remark "... isn't actually a transformation..." confuses now
> me. ;-) Well, I understood X as data as well, I have no idea, where
> the idea of "being a transformation" comes from and what I am
> possibly missing here. Very likely I didn't properly catch your way
> of approaching the problem...
That's the view of X = W Sigma Vt. Sigma is an anisotropic axial stretch while W rotates these stretch axes to the principal components of the data in m-space. What is never explained is what Vt rotates. In order for the rotations and stretches to apply, X=W*Sigma*Vt must be viewed as a transformation applied to a vector (or a collection of column vectors). Which means Vt is first applied, then Sigma, then W. W and Vt are orthogonal rotations.
However, X isn't a transformation that is applied to data vectors, and it is hard to imagine what vectors Vt would apply to. They would have to be in n-space, but n-space doesn't have much meaning in the context of finding correlations between the m data sets (one from each sensor).
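A small NumPy sketch of the two readings of X = W Sigma Vt, using a made-up m x n data matrix (sizes and random values are illustrative only). Note that np.linalg.svd returns the right factor already transposed, so the "X * Vt" written in the quoted text corresponds to X @ Vt.T here. Read this way, the right-hand rotation acts on the rows of X, i.e. on the m sensor vectors sitting in n-space, and the diagonal of Sigma holds singular values whose squares are the eigenvalues of X X'.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 6                       # hypothetical sizes: 3 sensors, 6 samples
X = rng.standard_normal((m, n))

# Thin SVD: W is m x m, s holds the m singular values, Vt is m x n.
W, s, Vt = np.linalg.svd(X, full_matrices=False)
Sigma = np.diag(s)

# The factorization reproduces the data matrix.
print(np.allclose(W @ Sigma @ Vt, X))

# Rotating the rows of X in n-space gives coordinates with orthogonal columns.
B = X @ Vt.T                      # equals W @ Sigma
print(np.round(B.T @ B, 4))       # diagonal, with entries s**2

# Applying the left rotation afterwards leaves only the diagonal stretch.
print(np.round(W.T @ B, 4))       # equals Sigma

# Squared singular values are the eigenvalues of X X'.
eig = np.sort(np.linalg.eigvalsh(X @ X.T))[::-1]
print(np.allclose(eig, s**2))
```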
> --------------------------------------------------------
> (...)
>
>> Furthermore, when I am seeking correlation between the m sensors,
>> it confounds me to think about why one would picture the data
>> points in n- space. As an analogy, if I am doing simple linear
>> regression on a cloud of 1000 points in the x-y plane, I don't try
>> to picture the data points in 1000-dimension space.
>
> Well, we might say, such a concept is superfluous, not needed. It
> just reflects a possibility which occurs when we look at the
> correlation matrix and its cholesky-factors. Say, with our m x n
> -datamatrix X (I use the '-apostroph for transposition)
>
>    R = X * X' / n // R is the m x m correlation-matrix
>
> then we have also with some rotation W
>
>    Z = W * R * W' // Z = Sigma = diagonal
>
> but also, if we see R in its cholesky-factors L and L'
>
>    Z = W * (L * L') * W' // Z = Sigma = diagonal
>
> and because any rotation-matrix t postmultiplied with its transpose
> is the identity
>
>    Z = W * (L * I * L') * W' = W * (L * t * t' * L') * W'
>
> Now L is usually taken as m x m matrix as well, but there is no
> problem to expand it by empty columns to make a m x n matrix
> out of it and then to assume t such that
>
>    L * t = X / sqrt(n)
>
> and then rewrite:
>
>    Z = W * (L * t * t' * L') * W' = W * (X * t' * t * X')/n * W'
>
> where again (X * t' * t * X')/n = X * X' /n = R shows the
> identity of the solutions.
>
>>> [24] t1 = gettrans(X,"Drei")
>>> t1 :
>>>    0.0856   0.0449   0.3898   0.6802  -0.4701  -0.3937
>>>    0.0929   0.0538  -0.1865  -0.1958   0.3348  -0.8963
>>>   -0.8486   0.1986   0.1513   0.2615   0.3856  -0.0206
>>>   -0.0516  -0.6916  -0.5339   0.4630   0.1392   0.0151
>>>    0.3812  -0.2498   0.5843   0.1452   0.6459   0.1125
>>>    0.3405   0.6441  -0.4049   0.4418   0.2858   0.1685
>
>> Sorry, I tried to google gettrans, but wasn't able to find much
>> beyond the fact that it is a column rotation. It's not clear to me
>> what is meant by that. Consequently, I wasn't able to follow the
>> rest of the example.
>
> With the given parameters X and "Drei" (="triangular") it calls the
> procedure, which returns that rotation-matrix, which can rotate X to
> lower triangular shape. Having it stored as an explicite matrix we
> can apply this rotation and also revert it and furtherly do anything
> we want with it.
>
> If you are using windows, you can even download that MatMate-program
> and do the steps yourself (and possibly experiment further) See my
> software-pages http://go.helms-net.de/sw/matmate. It's an amateurish
> program, however working nice for me, but if some installation
> problems occur (which is easily possible) let me know.
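The Cholesky argument quoted above can be checked numerically. The following is only a sketch under assumptions of my own: the rows of a made-up X are centered and scaled so that X X'/n really is a correlation matrix, X has full row rank, and the rotation t is built from a QR factorization plus a sign fix (the quoted text only asserts that such a t exists, not how to get it). The names L_wide, signs and U are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 8                                   # hypothetical sizes
X = rng.standard_normal((m, n))
# Standardize each row so that X @ X.T / n has ones on the diagonal.
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

R = X @ X.T / n                               # m x m correlation matrix
L = np.linalg.cholesky(R)                     # lower triangular, R = L @ L.T
L_wide = np.hstack([L, np.zeros((m, n - m))]) # L expanded by empty columns

# Build an orthogonal t with L_wide @ t = X / sqrt(n).
Q, U = np.linalg.qr(X.T / np.sqrt(n), mode="complete")  # X.T/sqrt(n) = Q @ U
signs = np.ones(n)
signs[:m] = np.sign(np.diag(U))               # make the triangular diagonal positive
t = (Q * signs).T                             # transpose of Q @ diag(signs)

print(np.allclose(t @ t.T, np.eye(n)))            # t is orthogonal
print(np.allclose(L_wide @ t, X / np.sqrt(n)))    # L * t = X / sqrt(n)
print(np.allclose(L_wide @ t @ t.T @ L_wide.T, R))  # same R either way
```

The sign fix is needed because a Cholesky factor has a positive diagonal by convention, while the triangular factor from QR may not.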