Why this was done:
The Misix Analytics Department decided to undertake the daunting task of creating a model by which to rank college basketball teams as a result of one employee’s anger with the Ratings Percentage Index (RPI) being used by most of the mainstream media. His constant complaining about the RPI led one of his friends to tell him, “If you think you can do it better, build one yourself. Otherwise shut up!!!” It was at that moment development of the Misix College Basketball Rankings (MCBR) began. The current method of calculating the RPI, as developed by the NCAA, is as follows:
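For reference, the RPI is commonly described as a weighted blend of three winning percentages: 25% the team's own winning percentage (WP), 50% its opponents' winning percentage (OWP), and 25% its opponents' opponents' winning percentage (OOWP). The sketch below assumes those commonly cited weights and ignores any home/road adjustment to WP; the function name and sample inputs are ours, for illustration only.

```python
def rpi(wp: float, owp: float, oowp: float) -> float:
    """Blend the three winning percentages with the commonly cited 25/50/25 weights."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Made-up inputs, purely to show the arithmetic.
example_rpi = rpi(wp=0.80, owp=0.55, oowp=0.50)
print(round(example_rpi, 4))
```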
While we agree that this provides a good estimate for ranking college basketball teams, we feel it is missing some key components that would give a more accurate ranking for each team. Given how heavily the media relies on the RPI for discussion purposes and how influential it can be financially in big-time college basketball, we thought we would introduce something we feel is more accurate to better arm the pundits come tournament time.
The Overall Formula:
In the MCBR, a team’s calculated score is based on the following variables:
The basic formula looks like this:
Score_t = f(Margin, Location, OpponentScore_{t-1}, Momentum)
The Logic Behind the MCBR:
Unlike the RPI, the MCBR incorporates margin of victory/loss. There are some who contest the use of this variable and argue that a win/loss is a win/loss no matter how you look at it. Winning or losing matters more than the margin, but a game that goes down to the wire and is decided by one point is not the same as a game where a team gets blown out by fifty. We created six categories to account for margin of victory/loss by first finding the standard deviation of the absolute value of the point differentials from the 2010–2011 season. Then, starting from zero, we stepped up in increments of 1/3 of the standard deviation to set the category boundaries. The following are the categories for margin of victory/loss:
We believe these categories should hold in future years because they are derived from margin of victory/loss. We will recalculate them periodically to make sure our assumption is correct and adjust accordingly if it is not.
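The category construction described above can be sketched as follows. This is only an illustration under our own assumptions: it uses the population standard deviation, treats the sixth category as open-ended, and puts a margin that lands exactly on a boundary into the lower category. The function names and sample point differentials are hypothetical and are not the 2010–2011 data.

```python
import statistics
from bisect import bisect_left


def margin_cutoffs(point_diffs, n_categories=6):
    """Return the upper bounds of the first n_categories - 1 margin categories."""
    abs_margins = [abs(d) for d in point_diffs]
    # Assumption: population standard deviation; the source does not say which kind.
    step = statistics.pstdev(abs_margins) / 3
    return [step * k for k in range(1, n_categories)]


def margin_category(margin, cutoffs):
    """Map a point margin to a category from 1 (narrowest) to len(cutoffs) + 1 (widest)."""
    return bisect_left(cutoffs, abs(margin)) + 1


# Made-up point differentials, not real season data.
sample_diffs = [3, -3, 9, -9, 15, -15]
cutoffs = margin_cutoffs(sample_diffs)
print([round(c, 2) for c in cutoffs])
print(margin_category(1, cutoffs), margin_category(-50, cutoffs))
```

Recomputing the cutoffs on a new season's point differentials is all that a periodic recalculation would require in this sketch.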
As with the RPI, we simultaneously take into account the location of the game: whether it was played at home, away, or on a neutral court. However, we have set up our model a little differently: we created a system of points that a team can earn for each game it plays. A 20 point victory at home is not worth as much as a 20 point victory on a neutral court, which in turn is not worth as much as a 20 point victory on the opposing team’s home court. For those critics who feel all wins, regardless of margin of victory, strength of the opponent, or location, should be worth more than losses, we control for this too. Wins have a set of 18 point levels (6 point categories for each game location) and losses have a separate set of 18 point levels. No loss is worth more than any victory; in fact, the victory worth the fewest points has 3.5 times more points assigned to it than the loss worth the most points.
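To make the structure of the point system concrete, here is a toy sketch. The point values are placeholders we invented and are NOT the actual MCBR point levels; only the shape (18 win levels, 18 loss levels, venue ordering for wins, and a win floor of 3.5 times the best loss) follows the description above. The venue ordering used for losses and the reading of "3.5 times" as a ratio are our assumptions.

```python
LOCATIONS = ("home", "neutral", "away")  # ordered from least to most valuable win
CATEGORIES = range(1, 7)                 # 1 = narrowest margin, 6 = widest margin


def toy_table():
    """Build a placeholder point table keyed by (result, location, margin category)."""
    table = {}
    for loc_rank, loc in enumerate(LOCATIONS):
        for cat in CATEGORIES:
            # Losses: closer margins and (by assumption) tougher venues earn more; 1..18.
            table[("loss", loc, cat)] = float((6 - cat) * 3 + loc_rank + 1)
            # Wins: wider margins and tougher venues earn more; 63..80.
            table[("win", loc, cat)] = 63.0 + (cat - 1) * 3 + loc_rank
    return table


def satisfies_stated_rules(table):
    """Check the constraints described in the text against a candidate table."""
    wins = [v for (result, _, _), v in table.items() if result == "win"]
    losses = [v for (result, _, _), v in table.items() if result == "loss"]
    levels_ok = len(wins) == 18 and len(losses) == 18
    # Assumed reading: the smallest win is at least 3.5x the largest loss.
    floor_ok = min(wins) >= 3.5 * max(losses)
    # Same margin category: home win < neutral win < away win.
    venue_ok = all(
        table[("win", "home", c)] < table[("win", "neutral", c)] < table[("win", "away", c)]
        for c in CATEGORIES
    )
    return levels_ok and floor_ok and venue_ok


table = toy_table()
print(satisfies_stated_rules(table))
```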
The opponent’s calculated score is another component of the MCBR. Unlike the RPI, which incorporates the opponents’ and the opponents’ opponents’ winning percentages separately from the team’s win/loss performance, we calculate this simultaneously. Outside of the first game for each team, the opposing team’s calculated score is factored into each game’s point assignments. For example, a 1 point loss at Kansas is worth more for a team than a 1 point loss at Houston Baptist. As the season moves forward, each opponent’s calculated score is constantly updated. For example, if Marquette beats a team with a calculated score of 17 at the beginning of the year, but that team finishes the season with a calculated score of 27, Marquette’s calculated score at the end of the year will be calculated with the 27, not the 17.
The overall strength of schedule (SOS) for a team is calculated by averaging the calculated scores of each of the team’s opponents to date. As a result of how we break things down, we are able to provide an SOS for the following:
We will make these public in the hopes that the overall ranking is better understood and to spark some interesting discussion.
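As a minimal sketch of the SOS calculation, the function below averages the most recent calculated scores of a team's opponents to date. The team labels, scores, and function name are hypothetical; looking scores up at calculation time (rather than storing the score from game day) mirrors the Marquette example, where the 27 is used and not the 17.

```python
from statistics import mean


def strength_of_schedule(opponents, current_scores):
    """Average the latest calculated score of every opponent played so far."""
    return mean(current_scores[opp] for opp in opponents)


# Hypothetical opponents and scores, for illustration only.
current_scores = {"Team A": 27.0, "Team B": 17.0, "Team C": 22.0}
opponents_to_date = ["Team A", "Team B"]
sos = strength_of_schedule(opponents_to_date, current_scores)
print(sos)
```

Passing a subset of the opponent list (for example, only one slice of the schedule) gives the SOS for that slice with the same function.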
Momentum is the final component of the MCBR. Again, this is one of those variables that gets discussed at tournament time. Should a team’s recent performance be factored into the decision to include/exclude a team from the tournament, or should it be decided strictly on their cumulative body of work? It is our belief that some weight should be given to a team that is playing well over its last 12 games. For each win out of a team’s last 12, a scaled number of points is added to its final score for the week. In simulations, this metric was only a factor for teams with scores that were very close, which is what we want: a sort of statistical tie breaker.
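The momentum adjustment can be sketched as below. The per-win value (`points_per_win`) is a made-up placeholder, and applying a flat amount per win is our assumption; the text only says the points are "scaled". A small value is used so that, as described, the bonus only separates teams whose scores are already very close.

```python
def momentum_bonus(results, points_per_win=0.1, window=12):
    """Return bonus points for wins among the most recent `window` games.

    `results` is a chronological list of "W"/"L" strings.
    `points_per_win` is a hypothetical scale, not the MCBR's actual value.
    """
    recent = results[-window:]
    return points_per_win * sum(1 for r in recent if r == "W")


# Made-up season: 5 losses, then 9 wins, then 3 losses (17 games).
season = ["L"] * 5 + ["W"] * 9 + ["L"] * 3
bonus = momentum_bonus(season)
print(round(bonus, 2))
```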
While the RPI, as calculated by the NCAA, is good at estimating a general ranking for college basketball teams, there are some key components we feel it does not address. In the MCBR, we feel we have captured these components allowing for a more accurate ranking of all teams. With a more accurate ranking, the media will be able to discuss teams more intelligently and hopefully better guide discussions of team inclusion/exclusion into the NCAA Tournament at the end of the season.
Adjusting the MCBR
Why this was done:
After reviewing the original ranking model, we determined there was a portion of a team’s performance that was being unintentionally excluded from the calculation. The original MCBR was built to rank a team based on the outcome of the game, the quality of the opponent and the location of the game. The missing component was the quality of play each team exhibited during the game. We believe this adjustment corrects the oversight.
How quality of play is estimated:
The reader should know that the ideas used for the quality-of-play estimation come from Ken Pomeroy and Dean Oliver. In many circles, Pomeroy is regarded as one of the top college basketball data analysts, and his work in advanced statistics has been discussed by ESPN and in publications such as the Wall Street Journal. The Misix quality-of-play estimation takes the ideas discussed by Pomeroy and tweaks them, using regression analysis rather than the Pythagorean calculation for expected winning percentage. We admire Pomeroy’s work, which is why it is the basis for our estimation, and we encourage our readers to follow Pomeroy and his analyses.
The overall formula:
The Misix quality-of-play estimation is a function of a model that forecasts a team’s winning percentage based on specific offensive and defensive statistics. The basic model structure, which comes from Oliver, is:
Each team’s results are then put in order and indexed using the team with the highest value as 100.
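The indexing step can be sketched as follows, assuming a simple ratio scaling in which each team's value is divided by the highest value and multiplied by 100. The source does not spell out the scaling, so treat that as our assumption; the team names and values are invented, and the sketch assumes the top value is positive.

```python
def index_to_100(values):
    """Scale values so the best team is 100, then sort from best to worst."""
    top = max(values.values())  # assumed to be positive
    indexed = {team: 100.0 * value / top for team, value in values.items()}
    return dict(sorted(indexed.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical forecast winning percentages.
forecasts = {"Team B": 0.45, "Team A": 0.90, "Team C": 0.72}
indexed = index_to_100(forecasts)
print(indexed)
```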
The logic behind the formula:
As discussed by Oliver and Pomeroy, four factors measure how good a team is when it has the ball and how well it defends when it doesn’t have the ball. The regression model we’re using estimates a team’s winning percentage based on its performance in each offensive and defensive statistic. Please note that we are using FTM/100 rather than free-throw rate. Here are the definitions of each variable used in the model:
- Effective field-goal percentage – basically the same as regular field-goal percentage, except made three-pointers are weighted 50% more heavily than made two-pointers (because three is 50% more than two).
Effg% = (1.5*3FGM + 2FGM)/FGA
- Turnover percentage
TO% = (Turnovers/Possessions)
- Offensive rebounding percentage
OR% = OR/(OR + DR_opponent)
- Free-throw makes per 100 possessions
FTM/100 = 100*(FTM/Possessions)
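The four statistics can be computed from box-score totals as in the sketch below. It uses the standard form of effective field-goal percentage, (2FGM + 1.5*3FGM)/FGA, which credits a made three with 50% more than a made two, and it scales free-throw makes to a per-100-possession rate. Possessions are taken as an input because the text does not say how they are estimated; the function name and sample numbers are hypothetical.

```python
def four_factors(two_fgm, three_fgm, fga, turnovers, possessions,
                 off_reb, opp_def_reb, ftm):
    """Compute the four statistics for one side of the ball from raw totals."""
    return {
        # A made three counts 1.5x a made two.
        "efg_pct": (two_fgm + 1.5 * three_fgm) / fga,
        # Share of possessions ending in a turnover.
        "to_pct": turnovers / possessions,
        # Share of available offensive rebounds the team collected.
        "or_pct": off_reb / (off_reb + opp_def_reb),
        # Free throws made per 100 possessions.
        "ftm_per_100": 100.0 * ftm / possessions,
    }


# Made-up single-game totals.
factors = four_factors(two_fgm=20, three_fgm=8, fga=60, turnovers=14,
                       possessions=70, off_reb=10, opp_def_reb=30, ftm=14)
print(factors)
```

Running the same function on the opponent's totals gives the defensive counterparts of each statistic.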
The final rankings combine the results of the original MCBR Index values with this new quality-of-play estimated index. We feel the combination of these two indices gives a more accurate ranking of a team’s quality of play and quality of opponent.
For further clarification or any questions, please contact Andy Martinelli at firstname.lastname@example.org.