
Re: Re your questions about the plots sent offline (and the underlying data posted here 12/13 at 10:33am)
Posted:
Dec 14, 2012 3:48 AM


On Dec 13, 9:45 pm, djh <halitsk...@att.net> wrote:
> You wrote:
>
> "What are the 1..36? All the other values are monotone increasing.
> Did they come that way, or did you sort them?
>
> The best way to see the difference between the plots is to take cols
> 2 & 3 as x & y coordinates, then plot the points along with a line
> from (0,0) to (1,1). The Splot is mostly below the line. The Cplot
> is mostly above. I'm not as struck by that difference as you seem to
> be. Where did the numbers come from?"
>
> Answers
>
> 1. The 1...36 are irrelevant if the data are plotted the way you
> suggest - they were just a way of giving Excel an x-axis to plot
> against. And thanks very much for the suggestion as to how to plot in
> cases like this - of course it never would have occurred to me to do
> it that way, and I was delighted to see that Excel lets you do it
> pretty easily (for a Microsoft-owned product, that is.)
You should seriously consider a real plotting program such as http://www.gnuplot.info/
> 2. Yes - columns 2 and 3 were sorted.
>
> 3. Here's where the numbers came from.
>
> Recall that:
>
> a) the fold x subset "het" data which I presented for Aubuqe on L at
> MoSS N, set 1:
>
> Slopes of Regressions of
> Aubqe on Length (L) for each
> Fold x Subset
> Set 1, Method N
>
> Fold x Subset      # of    Slope of
> (Set 1, Meth N)    L's     Aubqe on L
> a3_S_1_N            70     0.000188
> c1_C_1_N           101     0.000026
> a3_C_1_N            48     0.000052
> c1_S_1_N           101     0.000266
> c2_S_1_N            96     0.000421
> c2_C_1_N            95     0.000550
> b47_C_1_N           99     0.000618
> a1_S_1_N           101     0.001069
> b47_S_1_N           99     0.001079
> b1_S_1_N            31     0.001119
> b1_C_1_N            28     0.002015
> a1_C_1_N           101     0.002210
>
> were selected (because of their low associated "het" p) from the fold
> x subset data for the regression Aubque on L computed for ALL six
> combinations of Set x MoSS.
>
> b) to get all the fold x subset Aubque on L data for all combinations
> of Set x MoSS, we obviously had to first regress c on (e,u,u*e,u^2) at
> each Len x Set x MoSS x Fold x Subset.
You seem to switch willy-nilly between Aubuqe, Aubqe, and Aubque. How do they differ?
> Call this entire set of underlying data for c on (e,u,u*e,u^2) the
> "Rubqbase", and instead of computing the regression Aubque on L
> over the entire Rubqbase, compute the regression ueSlope on (ubar,
> ebar) over the entire Rubqbase, where:
>
> i) ueSlope is the slope of the u*e term in c on (e,u,u*e,u^2);
Do you mean the coefficient of u*e?
> ii) ubar is the mean of "u" (= u/(1+u)) at each L and ebar is the mean
> of "e" at each L.
>
> From each computation of ueSlope on (ubar, ebar) we have a pair of
> slopes with a pair of associated probabilities, and therefore across
> all combinations of Set x MoSS x Fold x Subset, we have 72 such pairs
> of probabilities, or 144 probabilities in all.
What is the "computation of ueSlope on (ubar, ebar)"? How do you get a pair of p's from it?
> DISREGARDING Fold and Set, divide these 144 probabilities into four
> groups:
>
> 36 at subset S, Method N
> 36 at subset C, Method N
> 36 at subset S, Method R
> 36 at subset C, Method R
>
> Sort each of these groups independently (lowest to highest p), and
> then pair off elements of these four groups as follows:
>
> pair off the 36 from S,N with the 36 from C,N by corresponding rank
> (from the sort of each group)
>
> pair off the 36 from S,R with the 36 from C,R by corresponding rank
> (from the sort of each group)
Regardless of the answers to my previous questions, you can't split naturally paired p's, sort them, re-pair them, and then compare the re-paired p's. You shouldn't compare them in the first place, even without the shuffling, because p-values are NOT effect sizes.
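To make the objection concrete, here is a minimal Python sketch. It does not use the posted data: the uniform random draws, the sample size of 36, the function names, and the z-test with known unit variance are all assumptions chosen for illustration. It shows two things. First, pairing two independently sorted columns by rank always produces a monotone set of points, however unrelated the columns were. Second, one and the same effect size gives very different p-values at different sample sizes.

```python
import math
import random

def repair_by_rank(xs, ys):
    """Sort each list independently, then pair elements by rank."""
    return list(zip(sorted(xs), sorted(ys)))

def is_monotone(pairs):
    """True if both coordinates are non-decreasing along the list."""
    return all(a[0] <= b[0] and a[1] <= b[1] for a, b in zip(pairs, pairs[1:]))

def two_sided_p(effect, n):
    """Two-sided p-value for z = effect * sqrt(n) (assumed unit variance)."""
    z = effect * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))

rng = random.Random(0)                    # fixed seed, illustration only
xs = [rng.random() for _ in range(36)]
ys = [rng.random() for _ in range(36)]    # drawn independently of xs

natural = sorted(zip(xs, ys))             # natural pairs, ordered by x
repaired = repair_by_rank(xs, ys)         # the sort-then-pair procedure

print(is_monotone(natural))               # nothing forces an ordering here
print(is_monotone(repaired))              # always True, by construction

# Same effect size (0.1), different n: the p-value tracks n, not the effect.
print(two_sided_p(0.1, 100))
print(two_sided_p(0.1, 10000))
```

Because the re-paired points are monotone by construction, any smooth-looking curve they trace reflects the sorting, not a relationship between the original pairs.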
> (Note (!!!!) that these pairings are DIFFERENT (!!!) from the
> pairings of (S,N) with (S,R) and (C,N) with (C,R) which I presented in
> my post of 12/13@12:33.)
>
> You will then have these two tables of paired p's (and the associated
> plot "done your way", which I've sent offline):
>
> SN,CN
>
> 0.004293565,0.000147868
> 0.009398,0.000235407
> 0.019790086,0.002576217
> 0.021645402,0.020854486
> 0.041148681,0.023919
> 0.056848093,0.041120964
> 0.169920851,0.042472596
> 0.236373,0.059794
> 0.248019846,0.079939524
> 0.277783068,0.087268176
> 0.281488299,0.13125994
> 0.287886,0.17489924
> 0.299769,0.180724763
> 0.299875026,0.185042614
> 0.360314613,0.207785097
> 0.370746358,0.21197145
> 0.406029587,0.228176227
> 0.43289,0.252242125
> 0.465398176,0.275296878
> 0.482382234,0.305134999
> 0.530897822,0.309388442
> 0.559333624,0.332112292
> 0.626424347,0.361024514
> 0.702399,0.41780334
> 0.741387901,0.423432022
> 0.768317356,0.476818276
> 0.820922877,0.542145
> 0.831159936,0.559098289
> 0.832584062,0.581960315
> 0.88900441,0.619627105
> 0.893789589,0.646265173
> 0.894253162,0.74717756
> 0.935126553,0.757530416
> 0.977748076,0.884119
> 0.980182674,0.900867429
> 0.984220184,0.938430375
>
> SR,CR
>
> 0.000503944,0.00011982
> 0.00118415,0.012214573
> 0.041027523,0.029133944
> 0.052112332,0.048936138
> 0.054021335,0.05764761
> 0.057693811,0.05865896
> 0.068659527,0.064182305
> 0.083710757,0.088376406
> 0.094021303,0.107473805
> 0.130456898,0.147682873
> 0.21540961,0.162392478
> 0.236780945,0.181759433
> 0.236936513,0.201847347
> 0.269875322,0.210439736
> 0.294476424,0.226305355
> 0.315561395,0.227038784
> 0.319462902,0.255699197
> 0.327971706,0.288864935
> 0.463861812,0.302035139
> 0.479255866,0.312164668
> 0.564392402,0.388447922
> 0.577382726,0.397416524
> 0.579430243,0.434182601
> 0.588970805,0.438280224
> 0.61542756,0.516128733
> 0.629984706,0.614130775
> 0.698570658,0.675962212
> 0.719544247,0.689950901
> 0.732798731,0.735779895
> 0.813873971,0.778392333
> 0.883957837,0.800207872
> 0.888276157,0.870729822
> 0.888377668,0.911149831
> 0.917545651,0.93512393
> 0.977990461,0.941162349
> 0.980356048,0.986071449
>
> So, depending on one's "IOT reaction" to the plot I've sent offline
> for the two tables above, one might be willing to say that in general,
> CN p's plot significantly lower than CR p's for equivalent SN's and
> SR's.
>
> And this result, assuming you're willing to accept it, is extremely
> important for the following reason.
>
> It says that regardless of dicodon set 1,2,3, the (S,N) subsets
> "evolved/were designed" (depending on your point of view - heh heh
> heh) so that mutation away from these sets to (C,N) sets does NOT
> change the predictive capacities of ubar and ebar in ueSlope on (ubar,
> ebar) as much as the predictive capacities of ubar and ebar in ueSlope
> on (ubar, ebar) are changed by the mutation of (S,R) sets to (C,R)
> sets.
>
> Or, to boil that statement down even further, the result says that we
> have found a (relative) INVARIANT UNDER MUTATION for (S,N) sets that
> does NOT exist for (S,R) sets. And the existence of this invariant
> strongly suggests that the (S,N) subsets of dicodon sets 1,2,3 all
> evolved to keep certain thermodynamic properties of protein messages
> relatively constant despite the mutation which these messages must
> perforce undergo over time.
>
> Finally, apart from this empirical interpretation of the plot I've
> sent offline, I have a "feeling" that the facts above regarding
> ueSlope on (ubar,ebar) must be related somehow to the facts we've been
> discussing regarding Aubqe on L. But if you agree, then the ball is
> now in your court for the obvious reason that I have neither the
> knowledge nor experience nor statistical brainpower to determine if
> ueSlope on (ubar,ebar) and Aubqe on L are related, and if so how ...
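For anyone without a plotting program to hand, the diagonal comparison (points against the line from (0,0) to (1,1)) can be roughed out in a few lines of Python. This sketch uses only the first six rows of each table quoted above. The function name `side_counts` is made up for illustration, and counting points on either side of y = x is a descriptive summary only, not a significance test.

```python
# First six rows of each quoted table: (x, y) = (S p-value, C p-value).
sn_cn = [
    (0.004293565, 0.000147868),
    (0.009398, 0.000235407),
    (0.019790086, 0.002576217),
    (0.021645402, 0.020854486),
    (0.041148681, 0.023919),
    (0.056848093, 0.041120964),
]
sr_cr = [
    (0.000503944, 0.00011982),
    (0.00118415, 0.012214573),
    (0.041027523, 0.029133944),
    (0.052112332, 0.048936138),
    (0.054021335, 0.05764761),
    (0.057693811, 0.05865896),
]

def side_counts(points):
    """Return (below, above): points under and over the line y = x."""
    below = sum(1 for x, y in points if y < x)
    above = sum(1 for x, y in points if y > x)
    return below, above

print(side_counts(sn_cn))  # (6, 0)
print(side_counts(sr_cr))  # (3, 3)
```

Bear in mind that both columns were sorted before pairing, so the position of the points relative to the diagonal compares the two sorted distributions rank by rank and says nothing about the original pairs.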

