On Nov 16, 6:24 pm, djh <halitsk...@att.net> wrote:
> I suspected there was going to be a need to map into [0,1], but didn't
> want to raise the question until I knew whether you were going to
> dismiss the suggestion as unworkable for other reasons.
>
> Regarding c, here are the two salient points:
>
> a) we ignore strings in which c is 0;
>
> b) I can guarantee we will never examine a message segment containing
> more than 253 codons, and therefore 252 dicodons (assuming we allow
> overlapping dicodons, which we do);
>
> So, c can go from 1 to 252, where 252 is reached if every dicodon
> c(i)c(i+1) for 1<=i<251 is in the set of dicodons with which we're
> working.
>
> Regarding u, suppose that:
>
> c) our dicodon set contains the L codon ctc and the S codon agt;
>
> d) in our string S of 252 overlapping dicodons c1...c252, c(i) is
> always ctc and c(i+1) is always agt for i odd.
>
> Then inasmuch as there are 6 L codons and 6 S codons, there are 6*6
> different dicodons signifying LS and 6*6 different dicodons signifying
> SL. And therefore, we would expect the LS dicodon ctcagt to occur
> 126/36 = 3.5 times in our string, and the SL dicodon agtctc to occur
> 126/36 = 3.5 times in our string. But since we insisted on (d) above,
> the LS dicodon ctcagt actually occurs 126 times in our string, and the
> SL dicodon agtctc actually occurs 126 times in our string.
>
> So each of these dicodons occurs 126/3.5 times more than expected, and
> therefore u for our string is 36.
>
> And inasmuch as no dicodon for any dipeptide has an expectancy LESS
> than 1/36, we can rightfully say that:
>
> e) the lowest possible u is 1/126 = .0079
>
> f) the highest possible u is 126/3.5 = 36.
>
> Assuming you agree on all of the above, I'm afraid you'll have to tell
> me how to map each of
>
> e: [221.735, 308.65]
> c: [1,252]
> u: [.0079, 36]
>
> into [0,1].
>
> (Not being snippy here ... just confessing my usual feckless ignorance
> on this particular matter.)
>
> By the way, permit me to correct an egregious typo in my previous post
> regarding the orthogonal projection.
>
> This:
>
>> Orthogonal projection of this coordinate system onto the plane Peuc
>> thru (1,0,0), (0,1,0), (0,0,1) will take:
>>
>> e,0,0 into the point Pe = ( (1/sqrt6)e, (-sqrt2/2)e )
>> 0,c,0 into the point Pc = ( (-sqrt(2/3))e, 0 )
>> 0,0,u into the point Pu = ( (1/sqrt6)e, (+sqrt2/2)e )
>
> should of course have been:
>
>> Orthogonal projection of this coordinate system onto the plane Peuc
>> thru (1,0,0), (0,1,0), (0,0,1) will take:
>>
>> e,0,0 into the point Pe = ( (1/sqrt6)e, (-sqrt2/2)e )
>> 0,c,0 into the point Pc = ( (-sqrt(2/3))c, 0 )
>> 0,0,u into the point Pu = ( (1/sqrt6)u, (+sqrt2/2)u )
>
> (I assume you simply recognized and "read-thru" this error, but I
> wanted to correct it for the record nonetheless.)

1. What you are suggesting is equivalent to rotating the axes using the orthonormal transformation
    [ a   b  -c ]
T = [ a  -2b  0 ],
    [ a   b   c ]
where a = -sqrt(1/3), b = sqrt(1/6), c = sqrt(1/2).
You discard the first column of T. Everything below holds with or without the first column.
[ Pe ]   [ e 0 0 ]
[ Pc ] = [ 0 c 0 ] . T
[ Pu ]   [ 0 0 u ]
where . denotes matrix multiplication (Mathematica notation).
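For anyone who wants to check the algebra numerically, here is a minimal Python sketch. It assumes nothing beyond the matrix T and the values of a, b, c given above; the helper names (matmul, transpose, project) and the sample values 2, 3, 5 are made up for illustration. Note that the matrix entry c = sqrt(1/2) is not the dicodon count c, so the count is called c_val in the code.

```python
import math

# Entries of T as defined in the post (c here is the matrix entry, not the count).
a = -math.sqrt(1 / 3)
b = math.sqrt(1 / 6)
c = math.sqrt(1 / 2)

T = [
    [a, b, -c],
    [a, -2 * b, 0.0],
    [a, b, c],
]


def matmul(A, B):
    """Plain nested-list matrix product A . B."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def transpose(A):
    return [list(row) for row in zip(*A)]


def project(e_val, c_val, u_val):
    """Rows of diag(e, c, u) . T with the first column discarded: Pe, Pc, Pu."""
    D = [[e_val, 0.0, 0.0], [0.0, c_val, 0.0], [0.0, 0.0, u_val]]
    return [row[1:] for row in matmul(D, T)]


# T is orthonormal if T' . T is the identity.
TtT = matmul(transpose(T), T)
is_orthonormal = all(
    math.isclose(TtT[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(3)
    for j in range(3)
)

# Illustrative values only; they are not taken from any data set.
Pe, Pc, Pu = project(2.0, 3.0, 5.0)
print(is_orthonormal, Pe, Pc, Pu)
```

With the first column dropped, the three rows come out as multiples of (1/sqrt6, -sqrt2/2), (-sqrt(2/3), 0) and (1/sqrt6, +sqrt2/2), which is the corrected projection quoted above.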
The centroid of those three points is [e c u].T/3, and the average centroid of any set of such points is [ebar cbar ubar].T/3, where [ebar cbar ubar] is the centroid of the points in terms of the original coordinates.
The same thing happens if you look at only pairs (e,c),(e,u),(c,u), except the divisor 3 changes to a 2.
So all you're suggesting is to look at some linear transformations of the centroids of the points at each L. That can never tell you anything about the within-L relations or if/how they change with L.
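A small sketch of why the centroids cannot see the within-L relations, again in plain Python. The two data sets below are invented for illustration: they have the same means of e, c, u, but u rises with e in one and falls with e in the other. The function names are hypothetical, and c is called cnt to avoid clashing with the matrix entry c.

```python
import math

a, b, c = -math.sqrt(1 / 3), math.sqrt(1 / 6), math.sqrt(1 / 2)
T = [[a, b, -c], [a, -2 * b, 0.0], [a, b, c]]


def row_times_T(v):
    """Row vector v times T."""
    return [sum(v[k] * T[k][j] for k in range(3)) for j in range(3)]


def triangle_centroid(e, cnt, u):
    """Centroid of Pe, Pc, Pu: the mean of the rows of diag(e, cnt, u) . T."""
    rows = [[e * t for t in T[0]], [cnt * t for t in T[1]], [u * t for t in T[2]]]
    return [sum(col) / 3 for col in zip(*rows)]


def average_centroid(points):
    """Average of the triangle centroids over a set of (e, cnt, u) points."""
    cents = [triangle_centroid(*p) for p in points]
    return [sum(col) / len(cents) for col in zip(*cents)]


def mean_point(points):
    """Centroid of the points in the original coordinates: [ebar cbar ubar]."""
    return [sum(col) / len(points) for col in zip(*points)]


# Made-up data: identical means, opposite relation between e and u.
rising = [(1.0, 2.0, 1.0), (2.0, 2.0, 2.0), (3.0, 2.0, 3.0)]
falling = [(1.0, 2.0, 3.0), (2.0, 2.0, 2.0), (3.0, 2.0, 1.0)]

avg_rising = average_centroid(rising)
avg_falling = average_centroid(falling)

# [ebar cbar ubar] . T / 3, computed directly from the means.
expected = [x / 3 for x in row_times_T(mean_point(rising))]

print(avg_rising, avg_falling, expected)
```

Both data sets give the same average centroid, and it equals [ebar cbar ubar].T/3, even though the relation between e and u is reversed from one set to the other.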
2. When I asked how you were mapping e,c,u into [0,1], I had misread your post as saying that you had done that. It isn't necessary. What is necessary is that the values in each column of T imply a linear combination that makes sense. Mapping x into [0,1] by x' = (x - x_min) / (x_max - x_min) does not necessarily make x' comparable to y' = (y - y_min) / (y_max - y_min) for some other variable y.
The same holds for more complicated transformations. In general, any transformation that is purely numerically driven, ignoring what the numbers represent, is suspect.
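To make the min-max point concrete, here is a short sketch using the ranges quoted above for e, c and u. The function name min_max and the dictionary RANGES are made up for illustration. It also assumes, as my reading of the quoted definition, that u = 1 means a dicodon occurs exactly as often as expected.

```python
def min_max(x, lo, hi):
    """Map x from [lo, hi] into [0, 1] by (x - lo) / (hi - lo)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return (x - lo) / (hi - lo)


# Ranges as quoted in the post.
RANGES = {
    "e": (221.735, 308.65),
    "c": (1.0, 252.0),
    "u": (0.0079, 36.0),
}

# Assumption: u = 1 is the "as expected" value. After rescaling it sits
# very close to 0, nowhere near the middle of [0, 1].
u_neutral = min_max(1.0, *RANGES["u"])

# The count c that lands on the same rescaled value as u = 1.
c_lo, c_hi = RANGES["c"]
c_same = c_lo + u_neutral * (c_hi - c_lo)

# The midpoint of the c range, by contrast, maps to 0.5.
c_mid = min_max(126.5, c_lo, c_hi)

print(u_neutral, c_same, c_mid)
```

A u of 1 and a count of about 8 end up at the same rescaled value, roughly 0.03. Nothing about the two quantities suggests those values are equivalent, so adding or differencing the rescaled numbers has no obvious meaning.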