Math Forum » Discussions » sci.math.* » sci.stat.math.independent

Topic: Interpretation of coefficients in multiple regressions which model linear dependence on an IV

Replies: 146   Last Post: Dec 15, 2012 6:44 PM

Ray Koopman

Re: Response to your last
Posted: Dec 8, 2012 6:32 PM

On Dec 7, 6:07 pm, djh <halitsk...@att.net> wrote:
> [...]
>
> II. You wrote:
>
> 2. Something's wrong somewhere. Those p's are too similar to one
> another, and are too large to be consistent with the other results
> you've been reporting.
>
> No - it's just that the "good" and "great" p's for u^2 are very
> length-specific, as shown by the following table for u^2 in
> regression c on (e,u,u*e,u^2) for Len | subset = S, method = N,
> fold = a1, set = 1. (Note that this table is sorted by increasing p.)
>
> So the question posed by the following table is the same basic
> question I actually asked several posts ago, namely: for (S, N, a1,
> 1), do we have ENOUGH "good" and "great" p's to claim that the model
> c on (e,u,u*e,u^2) "works" in a sufficient number of cases to "keep"
> it, at least for the factor combination (S, N, a1, 1)?
>
> Also, please note that similar tables exist for all of the factor
> combinations equivalent to (S, N, a1, 1), so is it possible we
> should actually be comparing the distributions of p for u^2 from all
> these different factor combinations ... to see which distributions
> of p are "left" of others and "right" of others in the horizontal
> sense (i.e. with p as the x-axis)?
>
> u^2 (t, df, p) Table: t, df, and p values for u^2 in regression c on
> (e,u,u*e,u^2) for Len | subset=S, method= N, fold=a1, set=1
>
> Len t df p
>
> 71 3.930 24 0.00063
> 26 3.434 44 0.00131
> 122 3.565 16 0.00258
> 24 3.162 47 0.00274
> 27 3.101 58 0.00297
> 110 3.396 16 0.00369
> 101 3.179 19 0.00494
> 35 2.870 59 0.00569
> 84 2.460 27 0.02058
> 109 2.462 25 0.02108
> 25 2.343 66 0.02216
> 73 2.185 31 0.03654
> 69 1.989 34 0.05474
> 62 1.988 24 0.05828
> 49 1.922 39 0.06193
> 55 1.929 31 0.06294
> 44 1.733 35 0.09186
> 37 1.667 68 0.10004
> 28 1.635 64 0.10697
> 54 1.639 33 0.11063
> 94 1.638 22 0.11567
> 41 1.616 32 0.11598
> 29 1.564 74 0.12219
> 30 1.546 64 0.12705
> 60 1.533 34 0.13462
> 75 1.510 20 0.14672
> 33 1.464 54 0.14893
> 66 1.451 35 0.15580
> 52 1.404 38 0.16830
> 74 1.394 25 0.17562
> 50 1.240 40 0.22236
> 32 1.216 47 0.22989
> 67 1.186 40 0.24280
> 63 1.147 28 0.26105
> 38 1.084 38 0.28513
> 53 1.065 33 0.29463
> 40 1.053 46 0.29789
> 68 1.064 19 0.30072
> 77 0.998 28 0.32687
> 58 0.989 32 0.32996
> 76 0.950 22 0.35222
> 48 0.873 38 0.38816
> 43 0.860 33 0.39616
> 80 0.807 31 0.42564
> 46 0.766 30 0.44947
> 87 0.717 17 0.48337
> 56 0.679 31 0.50249
> 45 0.677 29 0.50349
> 83 0.659 19 0.51765
> 96 0.644 23 0.52619
> 59 0.537 24 0.59645
> 61 0.490 39 0.62669
> 36 0.454 57 0.65159
> 39 0.443 30 0.66063
> 65 0.424 21 0.67621
> 120 0.390 16 0.70203
> 95 0.325 12 0.75075
> 51 0.288 45 0.77443
> 108 0.270 14 0.79079
> 31 0.234 65 0.81572
> 90 0.169 14 0.86841
> 111 0.124 18 0.90264
> 34 0.078 73 0.93820
> 47 0.065 45 0.94811
> 89 0.061 11 0.95249
> 42 0.002 31 0.99881
>
> III. You wrote:
>
> "In particular, you should not be considering any results from
> regressing c on (u,u^2) if e matters."
>
> I'm sorry to plead ignorance but nothing you've ever posted before has
> prepared me to understand you here at all. What I mean by this is the
> following.
>
> From the beginning we have been using a regression involving e, a
> regression involving u, and a regression involving (e,u) IN CONCERT,
> NOT as mutually exclusive alternatives.
>
> First we had:
>
> 1a) ln(c/L) on ln(c/e)
> 1b) ln(c/L) on ln(c/u)
> 1c) ln(c/L) on (ln(c/e), ln(c/u))
>
> Then, because of your reservations about these regressions, we
> simplified to
>
> 2a) c on e
> 2b) c on u
> 2c) c on (e,u)
>
> and that actually improved matters.
>
> And then finally, because of your very remarkable intuition that the
> "L/H" dichotomization of u should be replaced by adding u-related
> factors to the regressions themselves, we have arrived at
>
> 3a) c on (e,u,u*e), by addition of a u-factor to c on e
> 3b) c on (u,u^2), by addition of a u-factor to c on u
> 3c) c on (e,u,u*e,u^2), by addition of two u factors to c on (e,u)
>
> So ... if we never intended 1a-c as mutually exclusive alternatives,
> nor 2a-c as mutually exclusive alternatives, why all of a sudden do
> we have to treat 3a-c as mutually exclusive alternatives? Please
> recall here that the ultimate goal was always to develop predictors
> for logistic regressions, and back when we were doing logistic
> regressions, you said it's best to throw everything into the soup
> that one can think of ... that's why we had logistic regression
> predictors based on MORE THAN ONE linear regression.
>
> Also, why is it NOT statistically legitimate to postulate that there
> are BOTH:
>
> a) a relationship between c and u that, as you suspected, is best
> expressed by c on (u,u^2) because the relationship changes with
> increasing u
>
> b) a relationship between c and e that, again as you expected, is best
> expressed by c on (e,u,u*e) because again, the relationship changes
> with increasing u.
>
> [...]
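For reference, the p's in the table in point II are two-sided t-test tail probabilities, reproducible from the t and df columns. A minimal stdlib-only sketch (in practice one would use a statistics library, e.g. scipy.stats.t.sf; the integration cutoff of 60 and the step count are arbitrary choices of this sketch):

```python
import math

def t_two_sided_p(t, df, hi=60.0, steps=20000):
    """Two-sided p-value 2*P(T >= |t|) for Student's t with df degrees
    of freedom, via composite Simpson integration of the t density
    from |t| out to a far cutoff (the tail beyond it is negligible)."""
    t = abs(t)
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    f = lambda x: c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    h = (hi - t) / steps
    s = f(t) + f(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(t + i * h)
    return 2.0 * (h / 3.0) * s

# First and 13th rows of the table: (3.930, 24) and (1.989, 34)
print(t_two_sided_p(3.930, 24))   # table gives 0.00063
print(t_two_sided_p(1.989, 34))   # table gives 0.05474
```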


Let me focus initially on 2a-c: the regressions of c on e, on u,
and on (e,u). There are two problems. First, c is a count, with no
measurement error, but both e and u contain measurement error. The
usual regression model, that we have been using all along, assumes
the opposite: that the predictors are known exactly, and that only
the d.v. contains measurement error. (I mentioned this in a post on
Oct 25 @ 12:54 pm.) However, I have been (and still am) willing to
ignore this problem because I believe the measurement errors are
probably negligible compared to random sampling error.

The other problem is something that I thought I had mentioned before,
but apparently I never got beyond thinking about it. If you wanted
the results of 2a-c for purely descriptive purposes, or to use as
input for other computations, then I would see nothing wrong with
doing all three. The problem comes when you ask for p-values. Then
you need to specify a probability model, and the models for 2a-c are
mutually exclusive (except in special cases, such as when at least
one of the regression coefficients in 2c is zero).

We have been using the "conditional regression" model: for 2c,
it says that for every (e,u) pair in the domain of interest,
c|(e,u) = a0 + a1*e + a2*u + error, where the errors are independent
identically-distributed zero-mean normal random variables. There are
no distributional assumptions about (e,u); their values are taken to
be given, arbitrary. If this model holds then neither 2a nor 2b can
hold, and so we cannot get p-values for their coefficients.
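To make the conditional-regression machinery concrete, here is a stdlib-only sketch that fits c on (e,u) by the normal equations and reports the coefficient t statistics. The data are simulated (the true coefficients 2.0, 0.5, 0.3 are invented for illustration); in practice one would use R's lm or a library OLS routine:

```python
import math, random

def ols(y, X):
    """OLS of y on the columns of X (intercept prepended).
    Returns (coefficients, t statistics). Solves the normal equations
    by Gauss-Jordan elimination; fine for a handful of predictors."""
    rows = [[1.0] + list(r) for r in X]
    n, k = len(y), len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Augment [A | I | b]; after elimination the middle block is A^-1
    # and the last column is the coefficient vector A^-1 b.
    M = [A[i] + [1.0 if i == j else 0.0 for j in range(k)] + [b[i]]
         for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(k):
            if r != col:
                fac = M[r][col]
                M[r] = [v - fac * w for v, w in zip(M[r], M[col])]
    beta = [M[i][-1] for i in range(k)]
    inv = [[M[i][k + j] for j in range(k)] for i in range(k)]
    resid = [yi - sum(bj * xj for bj, xj in zip(beta, r))
             for yi, r in zip(y, rows)]
    s2 = sum(r * r for r in resid) / (n - k)       # error variance estimate
    ts = [beta[j] / math.sqrt(s2 * inv[j][j]) for j in range(k)]
    return beta, ts

random.seed(1)
e = [random.uniform(0, 10) for _ in range(60)]
u = [random.uniform(0, 10) for _ in range(60)]
c = [2.0 + 0.5 * ei + 0.3 * ui + random.gauss(0, 0.1)
     for ei, ui in zip(e, u)]
beta, ts = ols(c, list(zip(e, u)))
print([round(v, 3) for v in beta])   # close to the true (2.0, 0.5, 0.3)
```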

One way to legitimize p-values for 2a-c would be to switch to a
completely random model, in which the sample triples (c,e,u) are
assumed to come from a trivariate normal distribution. (The trivariate
normal model is equivalent to augmenting the conditional regression
model with the assumption that the sample pairs (e,u) come from a
bivariate normal distribution.) However, that would rule out 3a-c,
because all the regressions in any multivariate normal distribution
are purely linear; there are no product terms or squared terms.
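To spell that out with the standard conditional-normal formulas: if (c,e,u) is trivariate normal, then

```latex
E[\,c \mid e,u\,] = \mu_c
  + \Sigma_{c,(e,u)}\,\Sigma_{(e,u),(e,u)}^{-1}
    \begin{pmatrix} e-\mu_e \\ u-\mu_u \end{pmatrix},
\qquad
\operatorname{Var}[\,c \mid e,u\,]
  = \sigma_c^2
  - \Sigma_{c,(e,u)}\,\Sigma_{(e,u),(e,u)}^{-1}\,\Sigma_{(e,u),c},
```

which is exactly linear in e and u with constant conditional variance, so no u^2 or u*e term can arise.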

A plot of the ordered p's from point II against their ranks is
sufficiently different (by the IOT test) from plots of ordered random
Uniform[0,1] variables against their ranks to allow the conclusion
that the coefficient of u^2 is generally nonzero when subset=S,
method=N, fold=a1, set=1. Accordingly, I see no defensible way to
attach p-values to coefficients in models that omit u^2 in that cell.
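Reading "IOT" as the informal interocular-trauma test, a formal stand-in here would be the one-sample Kolmogorov-Smirnov distance between the 66 ordered p's from point II and the Uniform[0,1] order statistics. A sketch (the 1.36/sqrt(n) threshold is the usual large-sample 5% approximation, my choice rather than anything from the thread):

```python
import math

# The 66 p's for u^2 from the table in point II (already sorted ascending)
ps = [
    0.00063, 0.00131, 0.00258, 0.00274, 0.00297, 0.00369, 0.00494, 0.00569,
    0.02058, 0.02108, 0.02216, 0.03654, 0.05474, 0.05828, 0.06193, 0.06294,
    0.09186, 0.10004, 0.10697, 0.11063, 0.11567, 0.11598, 0.12219, 0.12705,
    0.13462, 0.14672, 0.14893, 0.15580, 0.16830, 0.17562, 0.22236, 0.22989,
    0.24280, 0.26105, 0.28513, 0.29463, 0.29789, 0.30072, 0.32687, 0.32996,
    0.35222, 0.38816, 0.39616, 0.42564, 0.44947, 0.48337, 0.50249, 0.50349,
    0.51765, 0.52619, 0.59645, 0.62669, 0.65159, 0.66063, 0.67621, 0.70203,
    0.75075, 0.77443, 0.79079, 0.81572, 0.86841, 0.90264, 0.93820, 0.94811,
    0.95249, 0.99881,
]
n = len(ps)
# KS distance between the empirical CDF of the p's and the Uniform[0,1] CDF
D = max(max((i + 1) / n - p, p - i / n) for i, p in enumerate(ps))
crit = 1.36 / math.sqrt(n)   # approximate 5% critical value
print(n, round(D, 3), round(crit, 3))
```

D comfortably exceeds the threshold, agreeing with the eyeball conclusion that these p's are not plausibly Uniform[0,1].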


