Re: distribution of regression coefficients
Posted: Nov 11, 2010 5:15 PM
On Nov 11, 12:12 pm, "Rod" <rodrodrod...@hotmail.com> wrote:
> "Paul" <paul_ru...@att.net> wrote in message
>
> news:34c15dfb-a383-43ee-b01e-b7750d2f1cd4@u11g2000prn.googlegroups.com...
> On Nov 11, 5:45 am, "Rod" <rodrodrod...@hotmail.com> wrote:
>
> >
> > "Ray Koopman" <koop...@sfu.ca> wrote in message
> > >news:cc3b56b7-ce12-4f8c-8a9a-c5f2592f6e8b@n24g2000prj.googlegroups.com...
>
> > > On Nov 10, 4:07 am, "Rod" <rodrodrod...@hotmail.com> wrote:
> > >> On Nov 10, 2:20 am, Ray Koopman <koop...@sfu.ca> wrote:
> > >>> On Nov 10, 12:44 am, "Rod" <rodrodrod...@hotmail.com> wrote:
>
> > >>>> In regression y = a + b*x
>
> > >>>> I know how to compute the covariance matrix for a and b.
>
> > >>>> I also know that a and b are normally distributed,
>
> > >>>> but what is the joint distribution of a and b?
>
> > >>>> Its tempting to guess bivariate normal
>
> > >>>> but I don't see how to show that.
>
> > >>> y = X beta + e
>
> > >>> W = (X'X)^-1 X'
>
> > >>> b = Wy
> > >>> = beta + We
>
> > >>> If e is multivariate normal then so is We, and hence b.
>
> > >> ditto a, but what of the joint distribution given that a and b are
> > >> correlated?
>
> > > Sorry, I should have been more explicit (and used only low-ascii
> > > characters). In what I wrote, X is a given n by p matrix
> > > of predictors, where n is the # of cases and p is the # of
> > > predictors, beta is a p-vector of unknown coefficients, and e is
> > > a random n-vector. If one of the columns of X is a dummy predictor
> > > whose value is 1 for every case then the corresponding element in
> > > beta is the intercept (your a ). So the intercept is "just another
> > > coefficient".
>
> > > Whatever the distribution of e may be, if its mean vector and
> > > covariance matrix are m and S then the mean vector and covariance
> > > matrix of b are beta + Wm and WSW'. (Note: we usually assume
> > > m = [0,...,0]'.)
>
> > > If e is multivariate normal then b is also multivariate normal.
>
> > I rather hastily assumed my b was the same as yours, sorry.
> > OK I get it that if the e are normal then b is just a linear combination
> > and
> > hence also normal.
> > Either I am not understanding what you are saying (likely), or you haven't
> > yet answered my question fully.
>
> It's the former.
>
> > To keep it simple lets keep the e normal and independent from each other.
> > Also let me return to my y=a+bx notation.
> > I am after the joint probability P(a,b) which because a and b are
> > correlated
> > is different to the product of the two distributions for a and b
> > separately.
> > I would put money on P being bivariate normal but for the life of me I
> > can't
> > see how to work that out.
>
> As Ray said, b = \beta + We, where W is computed from the X matrix.
> The theoretical variance-covariance matrix of b is E[(b-\beta)(b-
> \beta)'] = E[Wee'W']. Treat X, and therefore W, as constant with
> respect to the expectation. Since the e are assumed i.i.d. with zero
> mean and variance sigma^2, E[ee'] should be obvious (left to the
> reader as an exercise).
>
> /Paul
>
> Maybe I'm asking a silly question, or at least not communicating it
> correctly. But I am after the functional form of the joint probability of a
> and b or b_0 and b_k if you prefer.
>
> Thanks all for your contributions.
>
> Rod
You need to write out in detail those formulas that people have already given to you in general terms.
Let the true model be yi = a + b*xi + ei, where the ei are iid with distribution N(0,sigma^2). I really should use alpha and beta instead of a and b, but that makes reading and writing harder; instead, I use small letters for the unknown true values and capital letters for the regression estimates.
The fitted equation is y = A + B*x, where

  A = [Sx^2 * Sy - Sx * Sxy]/D,
  B = [n * Sxy - Sx * Sy]/D,
  D = n * Sx^2 - (Sx)^2.

Here, Sw means the sum of w_i for i = 1,...,n, so Sx^2 = sum(xi^2), Sxy = sum(xi*yi), etc., while (Sx)^2 is the square of sum(xi). Plugging in the expression yi = a + b*xi + ei, i=1,...,n, we get A = a + Ea and B = b + Eb, where

  Ea = [Sx^2 * Se - Sx * Sxe]/D,
  Eb = [n * Sxe - Sx * Se]/D.

This exhibits A-a and B-b as explicit linear combinations of e1,...,en (with constant coefficients), so A and B are automatically bivariate normal.

In fact, if we write Ea = sum(ui * ei, i=1..n) and Eb = sum(vi * ei, i=1..n), then the bivariate moment generating function of (A-a, B-b) is

  E exp(r*(A-a) + s*(B-b))
    = E exp(sum((r*ui + s*vi)*ei, i=1..n))
    = product(E exp((r*ui + s*vi)*ei), i=1..n)
    = product(exp((r*ui + s*vi)^2 * sigma^2/2), i=1..n)
    = exp(sum((r*ui + s*vi)^2 * sigma^2/2, i=1..n)).

The second equality uses the independence of the ei. Multiplying by exp(r*a + s*b) gives the generating function of (A,B) itself. You can write the exponent as (1/2)[U*r^2 + 2*W*r*s + V*s^2], so Var(A) = U, Var(B) = V and cov(A,B) = W. (Note that this W is a scalar covariance, not the matrix W used earlier in the thread.) Thus (A,B) has a bivariate normal distribution with mean = (a,b) and variance-covariance matrix [[U,W],[W,V]], where this means matrix = [row1, row2]. You have everything you need to write out U, V and W in detail.
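If you want to check your algebra numerically, here is a minimal Python sketch of the calculation above. It is only an illustration under stated assumptions: the function names (fit_line, coefficient_weights, covariance_entries) and the sample x values are made up for this example, and the weights ui = (Sx^2 - Sx*xi)/D and vi = (n*xi - Sx)/D are what I get by reading the coefficient of ei off the expressions for Ea and Eb.

```python
# Sketch only: names and data are illustrative, not from any library.

def sums(x):
    """Return n, Sx = sum(xi), Sxx = sum(xi^2) and D = n*Sxx - Sx^2."""
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    d = n * sxx - sx * sx  # D is nonzero as long as the xi are not all equal
    return n, sx, sxx, d

def fit_line(x, y):
    """Least-squares estimates A (intercept) and B (slope) from the sum formulas."""
    n, sx, sxx, d = sums(x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    A = (sxx * sy - sx * sxy) / d
    B = (n * sxy - sx * sy) / d
    return A, B

def coefficient_weights(x):
    """Weights with A - a = sum(ui*ei) and B - b = sum(vi*ei)."""
    n, sx, sxx, d = sums(x)
    u = [(sxx - sx * xi) / d for xi in x]
    v = [(n * xi - sx) / d for xi in x]
    return u, v

def covariance_entries(x, sigma2):
    """U = Var(A), V = Var(B), W = cov(A,B), read off the exponent of the MGF."""
    u, v = coefficient_weights(x)
    U = sigma2 * sum(ui * ui for ui in u)
    V = sigma2 * sum(vi * vi for vi in v)
    W = sigma2 * sum(ui * vi for ui, vi in zip(u, v))
    return U, V, W

if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 7.0]  # made-up predictor values
    print(covariance_entries(x, sigma2=0.25))
```

The sums returned by covariance_entries should agree with the entries of sigma^2 * (X'X)^-1 from the matrix form given earlier in the thread, which makes a convenient check on a hand derivation.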
R.G. Vickson