Sunday, November 15, 2015

SORTACUS Football Rankings (And Math!)

There are only a couple of weeks left in the college football season, and the race for the playoffs is heating up!  Debate and discussion abound over which teams deserve to reach the playoffs.  Which conferences are the strongest?  Which wins are most impressive?  When can a one-loss team pass an undefeated one?  Fortunately, SORTACUS is here to give us an answer.


Back when K,TMHMST was more active, it was as much about math as sports.  Probability, game theory, and variance were regularly discussed.  I've given an overview of SORTACUS:  it uses only wins and losses to rank which teams are the most deserving.  I have not discussed much of the internal workings, however.  Until now.  Break out your calculators, it's math time!

SORTACUS uses Bayesian statistics to estimate the inherent ability of each team (using only wins and losses).  Each team starts off with a wide normal distribution over its possible ratings as a prior distribution.  Each game then has a likelihood function for a win or loss: it's more likely you have a higher rating if you win.  The likelihood function is the normal cumulative distribution function.  Unfortunately, a normal prior is not conjugate to this likelihood, so there's no pretty closed-form solution.  Instead I have a series of ratings bins (right now 401) that each have a probability and together approximate the distribution.
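For the code-inclined, here is a rough Python sketch of the binned approach.  It is not the actual SORTACUS code.  The grid range, the prior width (PRIOR_SD), the game-to-game spread (GAME_SD), and the choice to feed the rating difference into the normal CDF are all assumptions I made up for illustration; only the 401 bins, the normal prior, and the normal CDF likelihood come from the description above.

```python
import math

# All four constants below are illustrative assumptions, not SORTACUS values
# (except the bin count, which the post gives as 401).
N_BINS = 401
LOW, HIGH = -50.0, 50.0   # assumed rating range covered by the bins
PRIOR_SD = 15.0           # assumed width of the "wide" normal prior
GAME_SD = 14.0            # assumed spread of single-game results


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normalize(weights):
    """Rescale the weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def make_grid():
    """Evenly spaced rating values, one per bin."""
    step = (HIGH - LOW) / (N_BINS - 1)
    return [LOW + i * step for i in range(N_BINS)]


def make_prior(grid):
    """Wide normal prior centered on an average (0) rating."""
    return normalize([math.exp(-0.5 * (r / PRIOR_SD) ** 2) for r in grid])


def update(grid, probs, opponent_rating, won):
    """Multiply each bin by the likelihood of the result, then normalize."""
    posterior = []
    for rating, p in zip(grid, probs):
        # Assumed likelihood: normal CDF of the scaled rating difference.
        p_win = normal_cdf((rating - opponent_rating) / GAME_SD)
        posterior.append(p * (p_win if won else 1.0 - p_win))
    return normalize(posterior)


def mean_rating(grid, probs):
    """Expected rating under the binned distribution."""
    return sum(r * p for r, p in zip(grid, probs))


grid = make_grid()
belief = make_prior(grid)
after_win = update(grid, belief, opponent_rating=5.0, won=True)
after_loss = update(grid, belief, opponent_rating=5.0, won=False)
print(mean_rating(grid, after_win), mean_rating(grid, after_loss))
```

A win shifts the probability toward the higher bins and a loss shifts it toward the lower ones, which is all the likelihood function is really doing.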

Whew, that's a lot of words.  Let's try an example instead.

Say Iowa, before playing Minnesota, has three possible ratings:  bad, average, and good.  Let's say they have the following odds:
Odds Iowa's rating is... Bad:  10%, Average: 50%, Good:  40%

That's our prior distribution (sort of).  We also know the odds that Iowa beats Minnesota if Iowa's rating is any of those categories.  These odds depend significantly on Minnesota's rating.
Odds Iowa beats Minnesota if its rating is... Bad:  20%, Average: 50%, Good: 80%

Now we need the observed data.  Remembering back to an unnecessarily stressful Saturday night, we know Iowa did beat Minnesota.  We can apply Bayes's formula (P(x|y) is proportional to P(x)*P(y|x), where x is Iowa's rating and y is the win) to calculate the probabilities of Iowa's rating given a victory over Minnesota:
Odds after winning that Iowa's rating is... Bad:  2% [10%*20%], Average:  25%, Good:  32%

Don't forget to normalize!
Odds after winning that Iowa's rating is... Bad:  3.4%, Average:  42.4%, Good:  54.2%
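If you'd rather let a computer do the multiplying, the three-bin example above can be checked with a few lines of Python.  The variable names are just mine; the numbers are the ones from the example.

```python
# Prior odds for Iowa's rating, from the example above.
prior = {"Bad": 0.10, "Average": 0.50, "Good": 0.40}

# Odds Iowa beats Minnesota given each rating (the likelihood of a win).
p_win = {"Bad": 0.20, "Average": 0.50, "Good": 0.80}

# Bayes: multiply prior by likelihood for each rating...
unnormalized = {k: prior[k] * p_win[k] for k in prior}

# ...then normalize so the three odds sum to 100%.
total = sum(unnormalized.values())
posterior = {k: v / total for k, v in unnormalized.items()}

for rating, prob in posterior.items():
    print(f"{rating}: {prob:.1%}")
# Bad: 3.4%
# Average: 42.4%
# Good: 54.2%
```

The normalizing total (2% + 25% + 32% = 59%) is itself meaningful: it's the overall chance, before the game, that Iowa wins.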

That's how SORTACUS works.  It does this process for 401 bins (not 3) for 128 teams for every game they play.  It also does multiple iterations to help converge towards the true value, each iteration having more accurate calculations of opponent quality.  The program takes ~5 minutes to crunch the numbers, and then spits out the college football rankings.
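To give a feel for the iteration step, here is a toy Python sketch with three made-up teams.  To be clear, this is a guess at the general shape, not the real program: the teams and results are hypothetical, the grid is coarser than 401 bins, PRIOR_SD and GAME_SD are invented, and summarizing each opponent by its average rating from the previous pass is my own simplifying assumption.

```python
import math

# Hypothetical toy season: each tuple is (winner, loser).
GAMES = [("A", "B"), ("A", "C"), ("B", "C")]
TEAMS = ["A", "B", "C"]

GRID = [-40.0 + 0.5 * i for i in range(161)]  # coarse grid for the toy example
PRIOR_SD = 15.0   # assumed prior width
GAME_SD = 14.0    # assumed single-game spread


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


PRIOR = normalize([math.exp(-0.5 * (r / PRIOR_SD) ** 2) for r in GRID])


def posterior_mean(team, estimates):
    """Restart from the prior and apply every game the team played,
    using the opponents' estimates from the previous pass."""
    probs = PRIOR
    for winner, loser in GAMES:
        if team not in (winner, loser):
            continue
        won = team == winner
        opponent = loser if won else winner
        updated = []
        for rating, p in zip(GRID, probs):
            p_win = normal_cdf((rating - estimates[opponent]) / GAME_SD)
            updated.append(p * (p_win if won else 1.0 - p_win))
        probs = normalize(updated)
    return sum(r * p for r, p in zip(GRID, probs))


def run(iterations=10):
    """Every team starts as average; each pass refines opponent quality."""
    estimates = {t: 0.0 for t in TEAMS}
    for _ in range(iterations):
        estimates = {t: posterior_mean(t, estimates) for t in TEAMS}
    return estimates


ratings = run()
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 2))
```

On the first pass every opponent looks average, so a win is just a win.  On later passes a win over a team that has since proven itself counts for more, which is the "more accurate calculations of opponent quality" part.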

Speaking of rankings, let's get to the part everybody's interested in.

Top 25

The non-mathematical readers likely just jumped straight to these rankings.  Below is the SORTACUS top 25 using games through Saturday, November 14.  In theory, the top 4 teams are the most worthy of a playoff spot so far.  Note that only FBS vs FBS games are considered (I see you Washington State), and that "Rating" corresponds roughly to how many points better than the FBS average a school has been.

Rank Team Record Rating
1  Clemson 9-0 23.52
2  Iowa 9-0 22.87
3  Oklahoma St 9-0 21.62
4  Alabama 9-1 21.58
5  Ohio St 10-0 21.20
6  Notre Dame 9-1 20.81
7  Houston 9-0 20.22
8  Florida 9-1 18.66
9  Navy 7-1 16.80
10  TCU 8-1 16.75
11  Oklahoma 9-1 16.41
12  Michigan St 9-1 16.35
13  LSU 7-2 15.32
14  Northwestern 7-2 15.32
15  Memphis 7-2 15.27
16  Stanford 8-2 14.73
17  Baylor 7-1 14.07
18  Michigan 8-2 14.06
19  Utah 8-2 13.33
20  North Carolina 7-1 12.72
21  Washington St 7-2 12.35
22  Toledo 8-1 12.24
23  Wisconsin 8-2 12.03
24  Mississippi 6-3 11.91
25  USC 7-3 11.16

The Playoffs

SORTACUS is making me look like a homer this year, but I promise it's just the numbers.  Iowa's victories over Northwestern and Wisconsin continue to get more impressive, and SORTACUS sees the Hawks as the nation's second most accomplished team this season.

Elsewhere in the playoffs, Clemson stands out as a clear number 1.  A weak ACC is offset by an impressive victory over Notre Dame.  Number 3 Oklahoma State is the Big 12's last unbeaten, and they face a tough road in their last two games.  For the final slot, Alabama jumps over defending champion Ohio State even with a loss.  Ohio State could improve that with a couple of late wins, but if any undefeated team loses, the Tide and Notre Dame are waiting at the door.

Further down in the standings it's interesting to notice the weakness of an ACC schedule (7-1 North Carolina at 20) compared to the surprising strength of an AAC schedule (7-2 Memphis at 15).

How Is My Team Doing

For the unlucky few who don't love the Hawkeyes (or Cardinal), here are a few notable team rankings.
Texas Tech:  44th, rating of 4.32
Auburn:  51, 3.38
Nebraska:  59, 1.22
NC State:  65, 0.17
Vanderbilt:  70, -1.32
Iowa State:  79, -3.80
Maryland:  89, -5.97
Purdue:  98, -10.32