"Aino" <aino.tietavainen@removeThis.helsinki.fi> wrote in message <email@example.com>...
> Hi all.
>
> I have a simple linear regression with x and y data. Now, if I take a sample, say (x1, y1), how do I get some probability that the sample belongs to the regressed data at hand?
>
> In another words, it is possible (somehow..) to get for example 95% prediction bounds/intervals to the regressed data, but how do I do the opposite, how do I get the "percentage" for a certain (x1, y1)?
>
> The bigger picture (for those who are interested): I have two sets of data and two regression lines, and I have to decide to which data set the sample belongs to. Linear discriminant analysis is not an option here, but anything "ANCOVA with unequal slopes" would be interesting.
>
So given a linear regression, you can compute the uncertainty of a predicted y at any point x. This takes the form of a normal distribution (strictly, a Student's t distribution when the residual variance is estimated from a small sample), with its mean at the value the line predicts at that x and a variance in y around that point. The variance is smallest near the mean of the x data and largest near the ends of the line, of course.
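For reference, here is the usual textbook form of that variance for a new observation at a point x_0. It assumes ordinary least squares with independent, equal-variance normal errors, and the symbols are my own choice, not anything from the original question:

```latex
\hat{y}_0 = \hat{a} + \hat{b}\,x_0, \qquad
s_{\mathrm{pred}}^2(x_0) = s^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right),
\qquad
S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad
s^2 = \frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - \hat{a} - \hat{b}\,x_i\bigr)^2
```

The (x_0 - \bar{x})^2 term is what makes the uncertainty grow as you move away from the middle of the data.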
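So given that (x,y) pair, you will have a normal distribution for each line. Measure how far y lies from the predicted value in units of the standard deviation at that x, then use the normal CDF to convert that distance to a probability score, for example the two-sided tail probability. You will get a different probability for each line, of course, so the line with the better score "wins".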
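A rough sketch of that comparison in plain Python, in case it helps. Everything here is illustrative: the function names (fit_line, tail_probability) are made up, the two data sets are invented numbers, and it uses the normal approximation described above together with the standard prediction variance for a new observation. With only a handful of points, a Student's t with n-2 degrees of freedom would be the more careful choice.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line; returns what is needed for prediction."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)  # residual variance estimate
    return {"n": n, "xbar": xbar, "sxx": sxx,
            "slope": slope, "intercept": intercept, "s2": s2}

def tail_probability(fit, x1, y1):
    """Two-sided normal tail probability of y1 given the fitted line at x1."""
    yhat = fit["intercept"] + fit["slope"] * x1
    # Prediction variance for a new observation: grows away from xbar.
    var_pred = fit["s2"] * (1.0 + 1.0 / fit["n"]
                            + (x1 - fit["xbar"]) ** 2 / fit["sxx"])
    z = (y1 - yhat) / math.sqrt(var_pred)
    # 2 * (1 - Phi(|z|)) written with erfc to keep precision in the far tail.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Invented example data for two regression lines (assumption, not real data).
xs_a = [1, 2, 3, 4, 5, 6]
ys_a = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
xs_b = [1, 2, 3, 4, 5, 6]
ys_b = [6.2, 5.4, 4.9, 4.1, 3.6, 2.8]

fit_a = fit_line(xs_a, ys_a)
fit_b = fit_line(xs_b, ys_b)

x1, y1 = 2.0, 2.2  # the sample to classify
p_a = tail_probability(fit_a, x1, y1)
p_b = tail_probability(fit_b, x1, y1)
winner = "A" if p_a > p_b else "B"
print("p_a =", p_a, "p_b =", p_b, "-> line", winner)
```

Note that this score is a tail probability under each line's model, not a posterior probability of membership; it only ranks the two lines against each other.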
A quick search online turns up at least a few sites with sufficient information to do the computations, here: