My son is more interested in NFL football than I am, and we had a recent discussion concerning the probabilities of certain teams being seeded in the playoffs. I first offered him a solution based on the simplistic notion that a good team will win in proportion to its season win percentage, but my son objected, because that didn't take into consideration how good or bad the opponent was. So I undertook to model the win/loss percentages for all of the NFL this year, and I offer the results to all of you.
I created a model based only on which teams played which teams, and whether the result was a win, loss, or draw. I didn't take into consideration the points scored, whether the game was home or away, whether there were injuries, etc. Also, each team was modeled as having an ability which remained constant over the year.
I decided to rate each team with a single number, such that the probability that a team rated "r1" beats a team rated "r2" is given by:
p = CDF[NormalDistribution[], r1 - r2];
I then used "FindMaximum" to find the set of ratings that maximizes the log-likelihood of the win/loss/tie results observed through the season. (Mathematica experts: is there a better way of doing this, perhaps using a built-in regression tool?)
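To make the fitting step concrete, here is a minimal sketch on made-up data for four teams. It is not the code from the package: the names games, winProb, logLik and r are hypothetical, the results are invented, and two choices are assumptions of mine, namely that a tie counts as half a win plus half a loss, and that the ratings are constrained to sum to zero (only rating differences enter the model, so some such normalization is needed).

```mathematica
(* Made-up results: {team A, team B, score for A}; 1 = win, 0 = loss, 1/2 = tie *)
games = {{1, 2, 1}, {2, 3, 1}, {3, 1, 1}, {1, 4, 1},
         {2, 4, 1}, {4, 3, 1}, {1, 3, 1/2}, {1, 2, 1}};
n = 4;
ratings = Array[r, n];   (* unknowns r[1], ..., r[n] *)

(* Probability that a team rated a beats a team rated b *)
winProb[a_, b_] := CDF[NormalDistribution[0, 1], a - b];

(* Log-likelihood; a tie contributes half a win and half a loss (assumption) *)
logLik = Total[
   Function[{i, j, s},
     s Log[winProb[r[i], r[j]]] + (1 - s) Log[winProb[r[j], r[i]]]] @@@ games];

(* Maximize, pinning the mean rating at zero (assumption) and starting from all zeros *)
{maxLogLik, rules} =
  FindMaximum[{logLik, Total[ratings] == 0}, {#, 0} & /@ ratings];
fitted = ratings /. rules;

(* Team-versus-team win probabilities as rounded integer percentages *)
percent = Table[Round[100 winProb[fitted[[i]], fitted[[j]]]], {i, n}, {j, n}]
```

Note that the maximum is finite only if no team (or group of teams) has won every game against the rest; the toy data above was chosen so that every team has at least one win and one loss.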
I only wrote this last week, so I built into it the ability to select only a portion of the season's results, to see how it would have performed historically. As more results have entered the model, its predictive power has grown to be pretty good. In the last four weeks, it has scored 11-5, 10-6, 12-4 and 12-4 in predicting the winners of games. In particular, I have identified 14 games in the last four weeks where the betting public seemed to be supporting the "wrong" team, and this method predicted the winner in 11 of those games. (I also used the model to estimate an expected value for the number of wins, and I have to admit that it's been lucky the last four weeks.)
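As a rough illustration of the "expected value" check, assuming that "wins" means correctly predicted games and that the games are independent: the expected number of correct picks in a week is the sum of the model's probabilities for the teams it picked, and the spread follows from the same probabilities. The probabilities below are invented for illustration, and pWin is a hypothetical name.

```mathematica
(* Invented model probabilities for the predicted winner of each game in one week *)
pWin = {0.62, 0.55, 0.71, 0.58, 0.80, 0.66};

(* Expected number of correct picks: sum of the individual probabilities *)
expectedCorrect = Total[pWin]

(* Standard deviation of the number of correct picks, assuming independent games *)
sdCorrect = Sqrt[Total[pWin (1 - pWin)]]
```

A week's actual score that sits well above expectedCorrect, relative to sdCorrect, is the kind of result that would count as "lucky".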
So I offer the below code for educational purposes, no warranty implied or expressed, your mileage may vary.
(* I wrote this code and hereby place it in the public domain. Scott Hemphill 24 December 2012 *)
(* Warning: If executed, this package will write a file called "matrix.m" which contains a 32x32 matrix containing the probabilities for each team beating each of the others, as rounded integer percentages. I edit this into a PostScript source which generates a pretty table. *)