The Math Forum

Topic: Multiple regression with all dummy variables
Replies: 7   Last Post: Feb 15, 2013 4:17 PM

Posts: 2
Registered: 12/11/12
Posted: Dec 11, 2012 1:20 PM

Does a multiple regression with all dummy (indicator) variables make
sense? I work at a state university tutoring various basic subjects
including college algebra, first semester calculus, and a two-semester
"Statistics for Business and Economics" sequence. In recent years my
students have been taught that an alternative to using the ANOVA
technique is to run a multiple regression analysis using all dummy
variables. A recent example given as a study guide for the final exam
was a comparison of used-car prices by color (white, black, blue, or
silver). Both ANOVA and a multiple regression (with black as the
excluded category) reject the null hypothesis that there is no
difference in prices by color. But the students are then told that the
multiple regression gives more information since we can conclude from
the t-tests on individual coefficients that silver cars sell for more
than the base case (black). I thought you needed at least one measured
(scalar?) variable among the explanatory variables -- it makes no
sense to do a scatter plot on just a dummy variable, so what on earth
is this "line" (or surface) you are getting from the regression?
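
To make the question concrete, here is a minimal sketch of what such a
regression computes. The prices are made-up numbers (in $1000s), not the
study-guide data, and the variable names are my own. Assuming ordinary
least squares with an intercept and one 0/1 dummy per non-base color, the
fitted "surface" only takes four values: the intercept is the mean of the
base group, and each dummy coefficient is that group's mean minus the base
mean. The overall regression F statistic should also match the one-way
ANOVA F statistic.

```python
import numpy as np

# Hypothetical used-car prices in $1000s, invented purely for illustration.
prices = {
    "black":  np.array([10.0, 12.0, 11.0, 13.0]),
    "white":  np.array([11.0, 12.5, 10.5, 12.0]),
    "blue":   np.array([9.5, 11.5, 10.0, 12.0]),
    "silver": np.array([14.0, 15.5, 13.5, 16.0]),
}
base = "black"                       # excluded (reference) category
others = [c for c in prices if c != base]

# Stack the responses: base group first, then the other groups in order.
y = np.concatenate([prices[base]] + [prices[c] for c in others])
n = y.size                           # total number of observations
k = len(prices)                      # number of groups = number of parameters

# Design matrix: a column of ones plus one 0/1 dummy per non-base color.
X = np.zeros((n, k))
X[:, 0] = 1.0
row = prices[base].size
for j, color in enumerate(others, start=1):
    m = prices[color].size
    X[row:row + m, j] = 1.0
    row += m

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Group means, and differences from the base group's mean.
means = {c: v.mean() for c, v in prices.items()}
mean_diffs = np.array([means[c] - means[base] for c in others])

# Overall F statistic from the regression.
resid = y - X @ beta
sse = resid @ resid
sst = ((y - y.mean()) ** 2).sum()
f_reg = ((sst - sse) / (k - 1)) / (sse / (n - k))

# One-way ANOVA F statistic computed directly from the groups.
ssb = sum(v.size * (v.mean() - y.mean()) ** 2 for v in prices.values())
ssw = sum(((v - v.mean()) ** 2).sum() for v in prices.values())
f_anova = (ssb / (k - 1)) / (ssw / (n - k))

print("intercept vs. base mean:", beta[0], means[base])
print("dummy coefficients:     ", beta[1:])
print("mean differences:       ", mean_diffs)
print("F (regression, ANOVA):  ", f_reg, f_anova)
```

If this sketch is right, there is no line through a scatter plot at all:
the regression is just a way of estimating four group means, expressed as
one mean plus three differences.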

So, is having at least one measured explanatory variable a basic
requirement for regression? Has anyone proven that the individual
coefficients on an all-dummy variable regression have no meaning?
Perhaps they follow a well-defined distribution, which might not be
Student's t. Are there any accessible on-line sources? I did not see
anything in the basic article on regression on Wikipedia.
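
For reference, the model in question can be written out as follows. This
is a sketch under the usual textbook assumptions (independent, normally
distributed errors with a common variance), with W, U, and S standing for
the white, blue, and silver dummies; the notation is mine, not the
course's. Under those assumptions the t statistic on a single dummy
coefficient is a two-sample comparison of that group with the base group,
using the error variance pooled over all four groups.

```latex
% All-dummy regression with black as the excluded category
y_i = \beta_0 + \beta_W W_i + \beta_U U_i + \beta_S S_i + \varepsilon_i

% Each group has its own expected price
E[y \mid \text{black}]  = \beta_0, \qquad
E[y \mid \text{silver}] = \beta_0 + \beta_S

% Least-squares estimate of a dummy coefficient
\hat\beta_S = \bar y_S - \bar y_{\text{black}}

% Its t statistic, with s^2 = \mathrm{SSE}/(N-4) pooled over all groups
t = \frac{\bar y_S - \bar y_{\text{black}}}
         {s\,\sqrt{\dfrac{1}{n_S} + \dfrac{1}{n_{\text{black}}}}}
  \;\sim\; t_{N-4} \quad \text{under } H_0\colon \beta_S = 0
```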

I'll mention that previously students were taught that, according to
the Central Limit Theorem, if you are doing hypothesis testing on a
mean and you have more than 30 or 40 data points, it's OK to assume
your test statistic is normally rather than t-distributed. They've
abandoned that nonsense, but I'm sceptical about these all-dummy
regressions.

Thanks for any help!
