Re: Re your questions about the plots sent off-line (and the underlying data posted here 12/13 at 10:33am)
Posted:
Dec 14, 2012 3:48 AM
On Dec 13, 9:45 pm, djh <halitsk...@att.net> wrote:
> You wrote:
>
> "What are the 1..36? All the other values are monotone increasing.
> Did they come that way, or did you sort them?
>
> The best way to see the difference between the plots is to take cols
> 2 & 3 as x & y coordinates, then plot the points along with a line
> from (0,0) to (1,1). The S-plot is mostly below the line. The C-plot
> is mostly above. I'm not as struck by that difference as you seem to
> be. Where did the numbers come from?"
>
> Answers
>
> 1. The 1...36 are irrelevant if the data are plotted the way you
> suggest -- they were just a way of giving Excel an x-axis to plot
> against. And thanks very much for the suggestion as to how to plot in
> cases like this -- of course it never would have occurred to me to do
> it that way, and I was delighted to see that Excel lets you do it
> pretty easily (for a Microsoft-owned product, that is.)
You should seriously consider a real plotting program such as http://www.gnuplot.info/
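The diagonal comparison quoted above can also be checked numerically before any plotting. The following is only a minimal sketch in Python: the function name `diagonal_split` and the sample points are hypothetical and are not taken from the posted data.

```python
def diagonal_split(points):
    """Count points below, on, and above the line y = x."""
    below = sum(1 for x, y in points if y < x)
    on = sum(1 for x, y in points if y == x)
    above = sum(1 for x, y in points if y > x)
    return below, on, above


# Hypothetical (x, y) pairs standing in for cols 2 & 3 of a table.
sample = [(0.50, 0.20), (0.30, 0.30), (0.10, 0.90), (0.80, 0.60)]

below, on, above = diagonal_split(sample)
print(f"below: {below}, on: {on}, above: {above}")
```

A split that is heavily lopsided is what "mostly below the line" or "mostly above" means in the plot; the counts say nothing by themselves about whether the difference matters.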
>
> 2. Yes -- columns 2 and 3 were sorted.
>
> 3. Here's where the numbers came from.
>
> Recall that:
>
> a) the fold x subset "het" data which I presented for Aubuqe on L at
> MoSS N, set 1:
>
> Slopes of Regressions of
> Aubqe on Length (L) for each
> Fold x Subset |
> Set 1, Method N
>
> Fold x Subset |   # of   Slope of
> Set 1, Meth N     L's    Aubqe on L
>
> a3_S_1_N           70    -0.000188
> c1_C_1_N          101    -0.000026
> a3_C_1_N           48     0.000052
> c1_S_1_N          101     0.000266
> c2_S_1_N           96     0.000421
> c2_C_1_N           95     0.000550
> b47_C_1_N          99     0.000618
> a1_S_1_N          101     0.001069
> b47_S_1_N          99     0.001079
> b1_S_1_N           31     0.001119
> b1_C_1_N           28     0.002015
> a1_C_1_N          101     0.002210
>
> were selected (because of their low associated "het" p) from the fold
> x subset data for the regression Aubque on L computed for ALL six
> combinations of Set x MoSS.
>
> b) to get all the fold x subset Aubque on L data for all combinations
> of Set x MoSS, we obviously had to first regress c on (e,u,u*e,u^2) at
> each Len x Set x MoSS x Fold x Subset.
You seem to switch willy-nilly between Aubuqe, Aubqe, and Aubque. How do they differ?
>
> Call this entire set of underlying data for c on (e,u,u*e,u^2) the
> "Rubq-base", and instead of computing the regression Aubque on L
> over the entire Rubq-base, compute the regression ueSlope on (ubar,
> ebar) over the entire Rubq-base, where:
>
> i) ueSlope is the slope of the u*e term in c on (e,u,u*e,u^2);
Do you mean the coefficient of u*e?
>
> ii) ubar is the mean of "u" (= u/(1+u)) at each L and ebar is the mean
> of "e" at each L.
>
> From each computation of ueSlope on (ubar, ebar) we have a pair of
> slopes with a pair of associated probabilities, and therefore across
> all combinations of Set x MoSS x Fold x Subset, we have 72 such pairs
> of probabilities, or 144 probabilities in all.
What is the "computation of ueSlope on (ubar, ebar)"? How do you get a pair of p's from it?
>
> DISREGARDING Fold and Set, divide these 144 probabilities into four
> groups:
>
> 36 at subset S, Method N
> 36 at subset C, Method N
> 36 at subset S, Method R
> 36 at subset C, Method R
>
> Sort each of these groups independently (lowest to highest p), and
> then pair off elements of these four groups as follows:
>
> pair off the 36 from S,N with the 36 from C,N by corresponding rank
> (from the sort of each group)
>
> pair off the 36 from S,R with the 36 from C,R by corresponding rank
> (from the sort of each group)
Regardless of the answers to my previous questions, you can't split naturally paired p's, sort them, re-pair them, and then compare the re-paired p's -- which you shouldn't compare in the first place, even without the shuffling, because p-values are NOT effect sizes.
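To see why the sort-and-re-pair step is a problem, here is a small sketch in Python using made-up numbers (not the posted p-values). The helper names `pearson` and `repair_by_rank` are hypothetical. The paired values start out perfectly negatively related; after each column is sorted on its own and re-paired by rank, they look perfectly positively related, so the re-paired table reflects the sorting rather than the original pairing.

```python
import math


def pearson(pairs):
    """Plain Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def repair_by_rank(pairs):
    """Sort each column independently, then pair values of equal rank."""
    xs = sorted(x for x, _ in pairs)
    ys = sorted(y for _, y in pairs)
    return list(zip(xs, ys))


# Made-up natural pairs in which y falls as x rises (y = 1 - x).
natural = [(i / 10, 1 - i / 10) for i in range(1, 10)]
shuffled = repair_by_rank(natural)

print("natural pairing:  r =", round(pearson(natural), 3))
print("re-paired by rank: r =", round(pearson(shuffled), 3))
```

Whatever relationship the natural pairs had, the re-paired columns are both increasing by construction, so a plot of them is always a monotone curve.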
>
> (Note (!!!!) that these pairings are DIFFERENT (!!!) from the
> pairings of (S,N) with (S,R) and (C,N) with (C,R) which I presented in
> my post of 12/13@12:33.)
>
> You will then have these two tables of paired p's (and the associated
> plot "done your way", which I've sent offline):
>
> SN,CN
>
> 0.004293565,0.000147868
> 0.009398,0.000235407
> 0.019790086,0.002576217
> 0.021645402,0.020854486
> 0.041148681,0.023919
> 0.056848093,0.041120964
> 0.169920851,0.042472596
> 0.236373,0.059794
> 0.248019846,0.079939524
> 0.277783068,0.087268176
> 0.281488299,0.13125994
> 0.287886,0.17489924
> 0.299769,0.180724763
> 0.299875026,0.185042614
> 0.360314613,0.207785097
> 0.370746358,0.21197145
> 0.406029587,0.228176227
> 0.43289,0.252242125
> 0.465398176,0.275296878
> 0.482382234,0.305134999
> 0.530897822,0.309388442
> 0.559333624,0.332112292
> 0.626424347,0.361024514
> 0.702399,0.41780334
> 0.741387901,0.423432022
> 0.768317356,0.476818276
> 0.820922877,0.542145
> 0.831159936,0.559098289
> 0.832584062,0.581960315
> 0.88900441,0.619627105
> 0.893789589,0.646265173
> 0.894253162,0.74717756
> 0.935126553,0.757530416
> 0.977748076,0.884119
> 0.980182674,0.900867429
> 0.984220184,0.938430375
>
> SR,CR
>
> 0.000503944,0.00011982
> 0.00118415,0.012214573
> 0.041027523,0.029133944
> 0.052112332,0.048936138
> 0.054021335,0.05764761
> 0.057693811,0.05865896
> 0.068659527,0.064182305
> 0.083710757,0.088376406
> 0.094021303,0.107473805
> 0.130456898,0.147682873
> 0.21540961,0.162392478
> 0.236780945,0.181759433
> 0.236936513,0.201847347
> 0.269875322,0.210439736
> 0.294476424,0.226305355
> 0.315561395,0.227038784
> 0.319462902,0.255699197
> 0.327971706,0.288864935
> 0.463861812,0.302035139
> 0.479255866,0.312164668
> 0.564392402,0.388447922
> 0.577382726,0.397416524
> 0.579430243,0.434182601
> 0.588970805,0.438280224
> 0.61542756,0.516128733
> 0.629984706,0.614130775
> 0.698570658,0.675962212
> 0.719544247,0.689950901
> 0.732798731,0.735779895
> 0.813873971,0.778392333
> 0.883957837,0.800207872
> 0.888276157,0.870729822
> 0.888377668,0.911149831
> 0.917545651,0.93512393
> 0.977990461,0.941162349
> 0.980356048,0.986071449
>
> So, depending on one's "IOT reaction" to the plot I've sent offline
> for the two tables above, one might be willing to say that in general,
> CN p's plot significantly lower than CR p's for equivalent SN's and
> SR's.
>
> And this result, assuming you're willing to accept it, is extremely
> important for the following reason.
>
> It says that regardless of dicodon set 1,2,3, the (S,N) subsets
> "evolved/were designed" (depending on your point of view -- heh heh
> heh) so that mutation away from these sets to (C,N) sets does NOT
> change the predictive capacities of ubar and ebar in ueSlope on (ubar,
> ebar) as much as the predictive capacities of ubar and ebar in ueSlope
> on (ubar, ebar) are changed by the mutation of (S,R) sets to (C,R)
> sets.
>
> Or, to boil that statement down even further, the result says that we
> have found a (relative) INVARIANT UNDER MUTATION for (S,N) sets that
> does NOT exist for (S,R) sets. And the existence of this invariant
> strongly suggests that the (S,N) subsets of dicodon sets 1,2,3 all
> evolved to keep certain thermodynamic properties of protein messages
> relatively constant despite the mutation which these messages must
> perforce undergo over time.
>
> Finally, apart from this empirical interpretation of the plot I've
> sent off line, I have a "feeling" that the facts above regarding
> ueSlope on (ubar,ebar) must be related somehow to the facts we've been
> discussing regarding Aubqe on L. But if you agree, then the ball is
> now in your court for the obvious reason that I have neither the
> knowledge nor experience nor statistical brain-power to determine if
> ueSlope on (ubar,ebar) and Aubqe on L are related, and if so how ...