Paul
Posts:
263
Registered:
2/23/10


Re: SVD for PCA: The right most rotation matrix
Posted:
Nov 6, 2012 2:10 AM


> On 04.11.2012 20:04, Paul wrote:
>> Gottfried wrote:
>>> No, gettrans is just a function call in my
>>> MatMate script language, which returns a rotation matrix. For
>>> instance, by the command:
>>>
>>> t1 = gettrans(X,"drei") // "drei" means "triangular"
>>>
>>> t1 becomes the rotation matrix, which is required to rotate
>>> column-wise...
>>
>> I am still not sure what is a columnwise rotation. Do you actually
>> switch columns around, or is it more like a geometric rotation?
>
> If you have a matrix X and postmultiply a rotation matrix, for
> instance we look at two columns of X only and apply a
> column rotation by the rotation matrix t consisting of
> cos/sin values:
>
>   | x1   y1  |     |  cos(phi)  sin(phi) |
>   | x2   y2  |  *  | -sin(phi)  cos(phi) |
>   | x3   y3  |
>   | ...  ... |
>   | xm   ym  |
>
> If phi is chosen such that the columns are then
>
>   | x1'  0   |
>   | x2'  y2' |
>   | x3'  y3' |
>   | ...  ... |
>   | xm'  ym' |
>
> this is a rotation to "triangular position". Clearly, if we do
> not look at two columns only, we can have a rotation of
> X to triangular position/shape as given in an earlier example
>
>> and we apply that rotation to X to get the sensor's data in that
>> rotated coordinates in matrix X1
>> [25] X1 = X*t1
>> X1 :
>> 2.4495 0.0000 0.0000 0.0000 0.0000 0.0000
>> 0.4790 2.4022 0.0000 0.0000 0.0000 0.0000
>> 1.0680 1.5080 1.6079 0.0000 0.0000 0.0000
>> ********************************************
>> we see, that we need only a m=3-dimensional space to account for
>> coordinates of 3 (linearly independent) vectors.
This is very...curious. I never knew that you could do that. I am working from high-school algebra when it comes to linear transformations, e.g. http://en.wikipedia.org/wiki/Transformation_matrix#Rotation . There, a rotation is effected by left-multiplying a point in space by a rotation matrix like the one you provide above (but transposed, which for a unitary matrix is its inverse). Is there an online reference for the theory that explains the right-multiplication and the reason why it yields a triangular matrix? Right-multiplication by a linear transformation matrix feels very unnatural and unintuitive to me right now.
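One way to connect the two conventions: if the points are stored as the rows of X, then right-multiplying by Q does to each row vector what left-multiplying by Q^T does to the same vector written as a column, since (X*Q)^T = Q^T * X^T. The rotation to triangular shape can then be read as a QR factorization of X^T. The sketch below assumes NumPy and made-up random data; it does not reproduce MatMate's gettrans, and the orthogonal matrix NumPy returns may include a reflection rather than being a pure rotation.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6                       # hypothetical sizes: 3 sensors, 6 samples
X = rng.standard_normal((m, n))   # one row per sensor (illustrative data only)

# Factor the transpose: X.T = Q @ R, with Q orthogonal (n x n)
# and R upper triangular (n x m).
Q, R = np.linalg.qr(X.T, mode="complete")

# Right-multiplying X by Q gives R.T, which is lower triangular:
# X = R.T @ Q.T, so X @ Q = R.T.
X1 = X @ Q

print(np.round(X1, 4))            # zeros above the diagonal (diagonal signs may be negative)

# Lengths of the rows and angles between them are unchanged,
# so the matrix of dot products between sensors is the same.
gram_before = X @ X.T
gram_after = X1 @ X1.T
```

Only the first m columns of X1 are nonzero, which matches the observation in the quoted example that m coordinates are enough to describe m vectors.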
> Here X1 has a triangular shape (only the lower triangle is nonzero).
> Because we talk of vectors in the n-dimensional space, and I take
> the vectors as a "wire model", I also say "triangular position"
> because the wire model was rotated such that the first wire
> (representing the first sensor) lies on the x-axis, the second wire
> (representing the second sensor) lies in the x-y plane and so on.
>
>> What is meant by rotating to triangular position? Do you mean
>> geometric position, or that X somehow becomes a triangular matrix
>> by rearranging its columns? What if there are not enough properly
>> placed zeros for that to be possible?
>
> Is it now understandable?
Yes, though it's new to me. I'm definitely feeling that I'm missing some theory.
> Ok, I see: I've mixed the formulas. You've given
>
> X=(W)(Sigma)(Vt)
>
> So I should always have taken
>
> Sigma = W^-1 * X * Vt^-1
>
> or, because W and Vt are orthogonal/rotations:
>
> Sigma = Wt * X * V
>
> So this should be replaced in my examples. However, my goal
> was only to make understandable the relevance of the n-space
> and the rotation in the n-space, and that the idea of correlations
> between the m sensors is just identical to the idea of that
> wire model in the n-space, having angles between them whose
> cosines are just the correlations.
Well, that's exactly it...the correlation (or rather, the covariance) between two sensors is the dot product between the two n-length sequences of data collected by those sensors. That is W, not Vt. Vt doesn't seem to have any intuitive meaning. On the Vt side, each entry is a correlation between the sets of readings at two points in time, and each vector in the dot product is the set of readings from all m sensors at one point in time. If I were manually looking for correlation between sensors, this correlation between samples in time would seem to have no relevance.
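The two kinds of dot product can be checked numerically. With X = W*Sigma*Vt, the sensor-by-sensor products X*X^T depend only on W and Sigma, while the time-by-time products X^T*X depend only on V and Sigma. The sketch below assumes NumPy, an m-by-n layout with one row per sensor, and random illustrative data; the variable names are not from the thread.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 8                                # hypothetical: 3 sensors, 8 time samples
X = rng.standard_normal((m, n))
X = X - X.mean(axis=1, keepdims=True)      # centre each sensor's readings

# Thin SVD: W is m x m, s holds the m singular values, Vt is m x n.
W, s, Vt = np.linalg.svd(X, full_matrices=False)
Sigma = np.diag(s)

# The corrected formula from the quoted text: Sigma = Wt * X * V.
Sigma_check = W.T @ X @ Vt.T

# Dot products between sensors (m x m): built from W and Sigma only.
G_sensors = X @ X.T                        # equals W @ Sigma**2 @ W.T

# Dot products between time samples (n x n): built from V and Sigma only.
G_times = X.T @ X                          # equals Vt.T @ Sigma**2 @ Vt
```

Dividing G_sensors by n - 1 would give the usual sample covariance matrix of the sensors; that scaling is left out here to keep the identities exact.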
>> That's the view of X = W Sigma Vt. Sigma is an anisotropic axial
>> stretch while W rotates these stretch axes to the principal
>> components of the data in m-space. What is never explained is what
>> Vt rotates. In order for the rotations and stretches to apply,
>> X=W*Sigma*Vt must be viewed as a transformation applied to a vector
>> (or a collection of column vectors). Which means Vt is first
>> applied, then Sigma, then W. W and Vt are orthogonal rotations.
>>
>> However, X isn't a transformation that is applied to data vectors,
>> and it is hard to imagine what vectors Vt would apply to. They
>> would have to be in n-space, but n-space doesn't have much meaning
>> in the context of finding correlations between the m data sets (one
>> from each sensor).
>
> Hmm, perhaps you should begin to look from the cosine between
> two vectors, say A and B.
> If in a multidimensional, say 10-dimensional, space the point "a"
> has the coordinates [4,5,1,4,2,3,3,1,5,4] and the point "b" the
> coordinates [2,3,7,9,1,4,6,6,4,2] then the vectors A, pointing
> from the origin to "a", and B, pointing to "b", have the angle
> between them whose cosine is determined by
>
>   (A * Bt)     4*2+5*3+1*7+4*9+2*1+3*4+3*6+1*6+5*4+4*2
>   --------  =  ---------------------------------------
>   l1 * l2                      l1*l2
>
> (where l1 and l2 are the "lengths" of A and B.)
>
> So we can put in the 10-dimensional space the two vectors/wires
> coming from the origin, pointing to "a" and "b". Those two vectors
> can now freely be rotated in that space, for instance such that
> A matches the x-axis, and that B lies then in the x-y plane. This
> rotation does not change the angle between them.
Yes, I realize that a unitary matrix preserves the angles between vectors and their lengths (each vector running from the origin to the point in space it represents). The thing is, for m sensors, the natural view is to look at the cloud of data points in m-space and find a straight-line relationship for the first principal component, not in n-space. Hence my confusion about the role of Vt.
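One reading of Vt that stays entirely in m-space: since W^T*X = Sigma*Vt, the rows of Sigma*Vt are the coordinates of the n data points along the principal axes given by the columns of W. In that reading Vt is not applied to anything; it simply records, up to the scaling in Sigma, where each time sample lands on each principal axis. The sketch below assumes NumPy and synthetic data in which two sensors are made to correlate; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 50                           # hypothetical: 3 sensors, 50 time samples
X = rng.standard_normal((m, n))
X[1] += 0.8 * X[0]                     # make sensors 0 and 1 correlate
X -= X.mean(axis=1, keepdims=True)     # centre each sensor

W, s, Vt = np.linalg.svd(X, full_matrices=False)

# Each column of X is one point of the data cloud in m-space.
# Projecting the cloud onto the principal axes (columns of W):
scores = W.T @ X                       # m x n, one column per time sample

# The same numbers, read off the right-hand factors of the SVD.
scores_from_vt = np.diag(s) @ Vt

# Direction of the first principal component in m-space.
first_axis = W[:, 0]

# In the rotated coordinates the components are uncorrelated:
# scores @ scores.T is diagonal, with s**2 on the diagonal.
score_gram = scores @ scores.T
```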
> But the above formula is the same which we also use to calculate the
> correlation, if the coordinates are taken as datasets/measures.
> (Note that in the example I've taken data such that their mean is
> zero for this example to work - the wire model/correlation analogy
> is only usable if the data are centered)
Yes, it has to be zero mean. In most cases, the data along each axis (before PCA) is also scaled so that the standard deviation is 1. Otherwise, the sensor with the greatest swing in readings will dominate in determining the principal component. This is not meaningful since the sensors might not even be making the same kind of physical measurement.
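Both points can be checked with a few lines. The sketch below assumes NumPy and made-up readings (not the vectors from the quoted example); it shows that the cosine between two centered data vectors equals their Pearson correlation, and that scaling each sensor to unit standard deviation turns the matrix of dot products into the correlation matrix, so a sensor with a large swing no longer dominates.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10                                   # hypothetical number of readings
a = rng.standard_normal(n)               # readings from one sensor
b = 0.5 * a + rng.standard_normal(n)     # a second, partly related sensor

# Cosine of the angle between the centred vectors.
a_c = a - a.mean()
b_c = b - b.mean()
cosine = (a_c @ b_c) / (np.linalg.norm(a_c) * np.linalg.norm(b_c))

# Pearson correlation of the raw readings.
pearson = np.corrcoef(a, b)[0, 1]

# Give the second sensor a much larger swing, then standardise each row
# to zero mean and unit (sample) standard deviation.
X = np.vstack([a, 1000.0 * b])
Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

# After standardising, the scaled dot products are the correlations,
# regardless of the original units.
corr_from_Z = Z @ Z.T / (n - 1)
```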

