Gaming the Rankings

This post is meant to be read in conjunction with Lou Burruss’ post explaining the underlying principles of the rankings. Without that, some of the terminology I use may seem foreign.

In an attempt to understand the rankings better, I analyzed the 2011 rankings of five teams in order to help my alma mater SUNY-Buffalo understand how they should plan their regular season. The questions I asked were: Should we play highly ranked teams? Should we go up against low-ranked teams and crush them? What would help us in the grand scheme of things?

The biggest story involving the rankings in 2011 was an abnormal result of a game between Harvard and Tufts. Tufts came away with a 15-9 win in the finals of One Nightstand. It was speculated that the win propelled Tufts into the top 20 rankings and gave New England a strength bid. Tufts ended up 15th in the rankings, took that second bid, and finished 10th at Nationals. This may have been a bigger story had Tufts performed poorly in Boulder, but throwing games to gain extra regional bids is still a concern under the current rankings system.

Let’s get right into things. Tufts played in 4 sanctioned tournaments in the 2011 regular season: Stanford Invite, College Southerns, Terminus, and One Nightstand. I compiled their games, found their opponents’ ratings at the end of the regular season, and calculated the score differential components and the relative weights based on the date each game was played. With all game ratings calculated (for instance, Tufts beating Harvard 15-9 resulted in a game rating of 2240, and Tufts losing to Carleton resulted in a rating of 1567), I calculated the total rating by dividing the sum of the weighted game ratings by the sum of the weights. My calculations came out with a rating of 1702 for Tufts, which is 4 points off their official rating of 1698. The difference is due to the fact that some teams did not have a reported regular season ranking because they did not play 10 sanctioned games. USAU still calculates a ranking for those games, but my rating was close enough for this discussion.
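The last step described above (sum of the weighted game ratings divided by the sum of the weights) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not USAU's implementation: the function name `team_rating` is made up, the two game ratings are the ones quoted above, and the date weights of 0.5 and 1.0 are placeholders, since the real weights are not listed here.

```python
def team_rating(games):
    """Weighted average of game ratings.

    games: list of (game_rating, date_weight) pairs.
    """
    total_weight = sum(weight for _, weight in games)
    if total_weight == 0:
        raise ValueError("need at least one game with a nonzero weight")
    return sum(rating * weight for rating, weight in games) / total_weight


# Game ratings quoted in the post; the weights are placeholders, NOT real date weights.
example_games = [
    (1567, 0.5),  # loss to Carleton (placeholder weight)
    (2240, 1.0),  # 15-9 win over Harvard (placeholder weight)
]

print(round(team_rating(example_games)))
```

With a full season of games and the real date weights in place of the placeholders, the same function would reproduce the kind of number discussed above.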

For the discussion of strength bids, the number that mattered in 2011 was 1627, or the “Michigan State” line. Michigan State was the 20th ranked team in the nation after the regular season, but because of an auto bid given to those pesky teams in the Metro East, they were the last team out. Any time Tufts achieved a game rating above 1627, they were moving towards a bid. Any time they got below that, they were doing themselves a disservice.

There is a common assumption that playing a harder schedule is better for teams. After all, you play highly ranked opponents and thus losing games doesn’t hurt you as much. Play Carleton 10 times in 2011 and lose 15-0 each time? You achieve a rating of 1567, which is only 60 points short of the 1627 needed for a strength bid. So yes, the loss to Carleton hurts, but it doesn’t hurt that much. I don’t have enough data to claim an all-knowing answer for teams, but it is possible to play “Captain Hindsight”. Under that assumption, Stanford Invite would have been the best tournament for Tufts to attend in 2011.
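The Carleton thought experiment follows from the weighted-average structure of the rating: if every game earns the same game rating, the date weights cancel out and the season rating equals that game rating. A quick sketch, using a hypothetical helper name and placeholder weights (the real date weights are not given here):

```python
def weighted_average(ratings, weights):
    """Sum of weight * rating divided by the sum of the weights."""
    return sum(r * w for r, w in zip(ratings, weights)) / sum(weights)


BID_LINE = 1627              # the "Michigan State" line
CARLETON_LOSS_RATING = 1567  # game rating quoted for a loss to Carleton

# Ten identical results; the increasing weights are placeholders, not real date weights.
ratings = [CARLETON_LOSS_RATING] * 10
weights = [1 + i for i in range(10)]

season_rating = weighted_average(ratings, weights)
print(season_rating, BID_LINE - season_rating)
```

Whatever weights are plugged in, the result stays at 1567, which is why the choice of placeholder weights does not matter for this particular example.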

Was Stanford Invite the best tournament for Tufts? No.

The ratings Tufts achieved at each individual tournament, with the ranking each rating would correspond to:

Stanford Invite – 1572 – #24

Georgia Southerns – 1746 – #14

Terminus – 1771 – #13

One Night Stand – 1764 – #14
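To put those four numbers next to the 1627 “Michigan State” line, here is a small Python sketch. It only subtracts the line from each tournament rating listed above; nothing in it comes from the actual USAU model, and the variable names are my own.

```python
BID_LINE = 1627  # the "Michigan State" line

# Per-tournament ratings as listed above.
tournament_ratings = {
    "Stanford Invite": 1572,
    "Georgia Southerns": 1746,
    "Terminus": 1771,
    "One Night Stand": 1764,
}

# Positive margin = above the line, negative = below it.
margins = {name: rating - BID_LINE for name, rating in tournament_ratings.items()}

for name, margin in margins.items():
    side = "above" if margin > 0 else "below"
    print(f"{name}: {abs(margin)} points {side} the line")
```

Stanford Invite is the only tournament that comes out below the line.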

What? Tufts performed their worst at Stanford Invite?

Tufts lost to Stanford, SDSU, UCSC, Pittsburgh and Carleton. Besides the Stanford loss, those aren’t really terrible results for a team looking to be on the up and up. They did not perform well, and that could be due to multiple reasons: first tournament of the season, long travel time, etc. They lost games by large margins to good opponents and won games by small margins against solid opponents, such as a 13-11 victory over Texas and a 13-11 victory over UCSB. Yet their best tournaments were at the end of the year, where they only faced one more top 20 team.

Why were College Southerns, Terminus and One Nightstand so beneficial for Tufts?

Tufts steamrolled over mediocre opponents at Georgia Southerns. They had only two victories that did not achieve the maximum score of 606 points: their 13-8 and 13-10 victories over Notre Dame and Central Florida. At Terminus and One Night Stand, Tufts continued to either beat down teams ranked between #50 and #100 or beat solid teams by slim margins. The only exception is the controversial Harvard game, in which Tufts achieved their best game rating of the year, 2240.

As we can see from the ratings for each tournament, Tufts was above the Michigan State line in every one except for the Stanford Invite. In order for Tufts to be below that line, they would have to be rated roughly 70 points lower. Is it possible that the Harvard game could provide the necessary swing?

Did the 15-9 victory over Harvard put Tufts into the Top 20?
The safe answer? Maybe. The likely answer? No.

Unfortunately, without the entire model, I cannot simulate the full outcome of any other situation. Results cascade across the model, meaning that if Tufts’ rating had gotten worse, the ratings of the opponents they played would also have gotten worse. This could mean something on the order of a few points, or, if the match-ups were right, it could make a big difference. However, there is some analysis available that gives us a likely answer.

If Tufts loses that game against Harvard 15-0, their rating gets adjusted by 56 points. So in that worst case scenario (not changing every other team’s rating), Tufts falls from 1698 to 1642, or from #15 to #18. If we imagine a scenario where Tufts is going all out with nothing to lose, the game is likely to resemble the conference final or regional final. Tufts lost to Harvard 15-9 at Conferences and then 11-8 at Regionals. If we replace the One Nightstand score with each of those two scores, Tufts’ final rating is 1648 and 1662 respectively, both of which are above the Michigan State line.
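Those scenarios can be lined up against the 1627 line with a short Python sketch. The ratings are taken as quoted above and, like the figures themselves, ignore any cascade through other teams’ ratings; the dictionary labels are mine.

```python
BID_LINE = 1627         # the "Michigan State" line
OFFICIAL_RATING = 1698  # Tufts' official end-of-season rating

# Final rating under each version of the One Nightstand game against Harvard.
scenarios = {
    "actual 15-9 win": OFFICIAL_RATING,
    "15-0 loss (worst case)": OFFICIAL_RATING - 56,
    "15-9 loss (Conferences score)": 1648,
    "11-8 loss (Regionals score)": 1662,
}

# How many points of cushion remain above the line in each scenario.
cushion = {label: rating - BID_LINE for label, rating in scenarios.items()}

for label, points in cushion.items():
    print(f"{label}: {points:+d} points relative to the line")
```

Even the worst case leaves Tufts with a small cushion above the line, before any cascade effects are considered.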

While we cannot accurately replicate the model without more work, we can make some educated guesses. Suppose Tufts falls by 50 points in the ratings. Then in each game that Tufts played, their opponent would get 50 points less for that game rating. Assume a 20-game schedule for each of their opponents and ignore date weights. A 50-point drop in one of 20 games means that each opponent’s rating would be lowered by about 2.5 points. That in turn lowers each of Tufts’ game ratings by 2.5 points, which averages out to Tufts’ rating being another 2.5 points lower. On the other hand, we also have to cascade Harvard’s increased rating across the board, which could have wider implications.
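The back-of-the-envelope cascade estimate is a single division, sketched below in Python. The function name is hypothetical, and the assumptions are the same as in the paragraph above: every opponent played 20 games and date weights are ignored.

```python
def second_order_drop(first_order_drop, games_per_opponent):
    """Rough extra drop in a team's rating caused by its own drop.

    Each opponent loses `first_order_drop` points in one of its
    `games_per_opponent` game ratings, so (ignoring date weights) its
    rating falls by the quotient. Every game rating the team earned
    against those opponents falls by the same amount, and so does the
    team's average.
    """
    return first_order_drop / games_per_opponent


print(second_order_drop(50, 20))  # 2.5
```

Opponents with shorter schedules would feel the drop more: with 10 games each, the same 50-point drop would feed back as 5 points.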

The point you want to know? Tufts was probably already in the Top 20, as they were on the cusp before the Harvard win.

What does this say about optimizing a schedule based on strength of opponent?

This continues to challenge the assumption that it’s best for teams to keep playing in hard tournaments. Below is a figure displaying Tufts’ game ratings vs. opponents’ strength.

The red lines divide the graph into three tiers of opponents: below #125, between #125 and #24, and above #24. The purple line represents the Michigan State line, as Tufts wanted any game they played to be above this line to count towards a strength bid. Again, should Tufts have played a more difficult schedule?

If we look at Tufts’ ratings in these three tiers, they are 1594 in the first (because they played teams below #125 and achieved maximum score), 1736 in the second and 1751 in the third. So playing tougher teams worked out for Tufts, since they had their best rating against the top tier of teams, or basically the “top 20” in the nation… Maybe.

The outlier in the third tier is the questionable Harvard game, and if we take that away, we get a rating against the top 23 of 1571. I think this shows that Tufts was best served by laying down the hammer on teams between #125 and #24. They demolished mid-tier teams, while most of the time they faltered against the top tier. That is how much the Harvard win skews the results in this case.

Now the question that can be asked in hindsight: what type of schedule should Tufts have played? Out of the 6 games they played against top 20 competition, 2 were above the Michigan State line. Out of 17 played against teams between #125 and #24, 2 were below the Michigan State line. Tufts had a higher probability of gaining a higher rating per game by playing in the #125 to #24 range.
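Expressed as hit rates, the gap between the two tiers is easy to see. The sketch below uses only the counts given above (2 of 6 games above the line against top 20 teams, and 17 minus 2 of 17 above the line against the middle tier); the variable names are mine.

```python
# Games played per tier and how many of them cleared the Michigan State line.
tiers = {
    "top 20": {"games": 6, "above_line": 2},
    "#125 to #24": {"games": 17, "above_line": 17 - 2},
}

# Share of games in each tier that counted towards a strength bid.
share_above = {
    name: counts["above_line"] / counts["games"] for name, counts in tiers.items()
}

for name, share in share_above.items():
    print(f"{name}: {share:.0%} of games above the line")
```

Roughly a third of the top 20 games cleared the line, compared with nearly nine in ten of the mid-tier games.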

I don’t know what this means in terms of correcting rankings, but I do know that if I were a coach I would be looking at tournaments like One Nightstand, New England Open, Terminus, Free State Classic, College Southerns, and even this weekend’s Tally Classic (take note of how Georgia’s and Wilmington’s rankings change next week) in order to boost my rating. For teams like Carleton and Wisconsin this means nothing; they are going to get a strength bid most years without worry. For a team that is in dire need of a second or third bid, this could mean the world.

This is not to say that teams should not attend hard tournaments. That is where you cut your teeth on hard competition that makes you better in the long run. However, it is safe to say that going to better tournaments will not always reward you in the rankings. Winning games and score differential play a huge role.

It is possible to maximize a schedule based on the score differential component in the rankings and knowing your past score differentials vs. past opponents. This is an optimization problem with an existing literature, so for those of you math majors out there, go to town.

Tomorrow I will be discussing how you can choose a schedule that makes it impossible to get a strength bid (Penn State), teams that play worse against lower opponents and better against higher opponents (Georgia Tech), and the last team out (Michigan State).

Photo by Andrew Davis.