Here's what you need to know about the USA Ultimate rankings, bid allocation, and the algorithm.
April 1, 2015 by Charlie Eisenhood in Analysis
By design, the USA Ultimate ranking algorithm is a bit of a black box: scores go in, rankings come out. If you are interested in the mathematical procedure, you can check out the gritty algorithm details. However, most people don’t want to take the time to understand that in its entirety, so we’ve put together a list of frequently asked questions about the USA Ultimate rankings and algorithm to help players and fans understand exactly what’s going on.
How does an individual game factor into the rankings?
When two teams face off, it’s not just about which team wins and which team loses. As with most good computer rating models in sports, the algorithm takes into account the relative strengths of the teams and the margin of victory. Therefore, a 12-11 win isn’t worth as much to the winner as a 12-8 win. Also, a close loss to a good team could help the loser’s rating, whereas a close win over a bad team could hurt the winner’s rating.
Games also decay in value over time, so more recent results are weighted more heavily.
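As a rough illustration of those two ideas — margin-sensitive game values anchored to the opponent’s rating, plus recency decay — here is a sketch in Python. To be clear, this is not USA Ultimate’s actual formula: the constants, the margin curve, and the decay rate are all invented for the example.

```python
def game_value(winner_rating, loser_rating, winner_score, loser_score,
               weeks_ago, decay_per_week=0.95):
    """Illustrative sketch only -- NOT USA Ultimate's actual formula.

    Returns (winner_game_rating, loser_game_rating, weight).
    The margin factor grows with the score differential but is capped,
    so a 12-8 win is worth more to the winner than a 12-11 win.
    """
    # Margin factor: 0 for the narrowest win, approaching 1 for a blowout.
    ratio = loser_score / max(winner_score - 1, 1)
    margin = min(1.0, 1.0 - ratio)        # 12-11 -> 0.0, 12-8 -> ~0.27
    diff = 125 + 475 * margin             # rating gap the result "implies" (made-up scale)
    # Each side's single-game rating is anchored to the OPPONENT's rating,
    # so a close win over a weak team can drag a strong team's average down,
    # and a close loss to a strong team can lift a weak team's average up.
    winner_game = loser_rating + diff
    loser_game = winner_rating - diff
    # Older games decay in weight, so recent results matter more.
    weight = decay_per_week ** weeks_ago
    return winner_game, loser_game, weight
```

For example, in this toy model a 1800-rated team beating a 1000-rated team 12-11 earns a single-game rating of only 1125 — well below its own rating — which is exactly how a close win over a bad team can hurt.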
For a more in-depth discussion of the value of individual games, see this article from Ultiworld rankings expert Scott Dunham.
How are the bids to the College Championships determined?
There are ten regions in the country; teams that qualify for Regionals will have an opportunity to play for a berth at the College Championships. Right away, one automatic bid to Nationals is assigned to each region, regardless of the outcome of the regular season. Think of it like college basketball — every conference (even the smallest, least competitive ones) gets a berth in the NCAA tournament.
The final 10 bids (“strength bids”) are allocated based on the results of the regular season. This is where the rankings and the algorithm come into play. After the teams are ranked, you can look at the top 20 teams to figure out how many bids a region will get. Start from the top of the rankings: count down from the top and assign each team a bid for their region. The first team from a region “gets” the auto bid; the next gets a strength bid. Go through until all strength bids have been allocated.
For example, take the Ohio Valley Men’s region. Currently, Pittsburgh is ranked #1 and Cincinnati is ranked #3. Pitt “earns” the automatic bid for the OV, and Cincinnati captures a strength bid. Similarly, the Northwest Women’s region has five teams in the top 18; they get one auto bid and four strength bids for a total of five bids to Nationals.
Remember that strength bids may not be allocated to every team in the top 20, if there are regions going unrepresented in that top group. For example, Cornell is the top ranked Division I Men’s team from the Metro East region at #74. They “earn” the auto bid for the ME, which means that #20 in the Men’s rankings won’t be getting a strength bid.
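The counting procedure described above can be sketched in code. This is a reconstruction of the process as described, not USA Ultimate’s implementation, and the team/region inputs are hypothetical:

```python
def allocate_bids(ranked, regions, total_bids=20):
    """Sketch of the bid-counting process described above (not official code).

    ranked:  list of (team, region) pairs, best team first.
    regions: all region names; each gets one automatic bid no matter what.
    Returns a dict mapping region -> total bids (auto + strength).
    """
    bids = {r: 1 for r in regions}   # automatic bids, assigned up front
    seen = set()
    used = 0                         # ranking slots consumed so far
    for team, region in ranked:
        # Slots that must stay reserved for the auto bids of regions that
        # haven't appeared yet (e.g. Metro East with its top team at #74).
        reserved = len(regions) - len(seen) - (0 if region in seen else 1)
        if used + 1 + reserved > total_bids:
            break                    # cutoff shrinks below 20 when a region is absent
        used += 1
        if region in seen:
            bids[region] += 1        # second, third, ... team: strength bid
        else:
            seen.add(region)         # first team from its region: the auto bid
    return bids
```

With 10 regions and one region whose best team sits outside the top 20, the loop stops after 19 teams — matching the Cornell example, where #20 in the Men’s rankings gets no strength bid.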
This method of assigning bids rewards regions that have more strong teams during a given season. In the past, bids were assigned based on the previous year’s results, the size of the region (number of teams), and other considerations. That proved to be unpopular because it wasn’t responsive to how teams were playing during the regular season.
Are there problems with the current algorithm and ranking system?
As with any ranking system, USA Ultimate’s is not perfect. The algorithm relies on the “connectivity” between teams — two teams that haven’t played each other can only be meaningfully compared if they share a common opponent (or a chain of common opponents). Generally, this means that early season rankings are very strange, because many teams’ results are isolated from one another. By the end of the season, most of those issues are worked out.
However, because teams are required to play just 10 games to be eligible for the rankings, sometimes the algorithm has difficulty placing a team accurately. The sample size isn’t large enough. That can lead to teams being ranked either too high or too low relative to the “eye test” of their results. This happens most years in college because of the large number of teams.
Because of the inherent uncertainty in the rankings with such a small number of games, there are many proponents of changes to the way bids are allocated. Currently, there is a bright line cutoff — make the top 20 (or 19, or 18, depending on auto bids) and get a strength bid for your region.
Many argue that the bright line cutoff should be removed and replaced with a “probabilistic” allocation that analyzes each region’s teams as a whole and assigns bids based on the overall likelihood that the region has ‘X’ teams in the top 20.
One key concept to understand in the move to a probabilistic ranking system is the difference between a team’s “actual” rank and their “true” rank. Imagine USA Ultimate CEO Dr. Tom Crawford were omniscient and knew exactly how good each team was; he could then generate a set of “true” rankings of teams. Of course, that’s impossible. So the actual rankings do a decent, but imperfect, job of approximating that fictional “true” set of rankings (and, by extension, of predicting future performance).
A team ranked #1 in the current (actual) rankings is, most likely, the #1 team in the country. But there’s also a chance that the #2 team in the country is in fact the best team, and that the #1 team is in fact only the second best. Of course, it’s highly unlikely that the #1 team’s true talent would be much worse (below, say, #10).
When implemented, the probabilistic ranking would allocate bids based on “how likely” it is that there is a team within each region above the bid cutoff. A simplified example: consider a bid cutoff of 20, with a New England team ranked #20, a Southwest team ranked #21, a South Central team ranked #22, and another Southwest team ranked #23. It is more likely than not that the NE team is better than the SC team. And in isolation, it is more likely than not that the NE team is better than each of the SW teams. But the odds that the NE team is better than both of the SW teams are less than 50%.
To put it another way: It is likely that one of the bubble SW teams is in fact one of the top 20 teams in the country, and the rankings are just doing an imperfect job of sorting. Since the SW region is more likely to have one team in the top 20 than the NE region is, the probabilistic method would move a bid from the NE to the SW.
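A Monte Carlo sketch makes the bubble-team logic concrete. Everything here is invented for illustration — the ratings, the uncertainty (sigma), and the simulation itself are not any proposed official method:

```python
import random

def head_to_head_probs(trials=200_000, seed=0):
    """Simulate 'true' strength for three bubble teams and estimate:
    - P(NE team is truly better than each SW team, pairwise)
    - P(NE team is truly better than BOTH SW teams)
    All numbers are made-up illustrative values.
    """
    rng = random.Random(seed)
    ne_mean, sw1_mean, sw2_mean = 1500, 1490, 1470   # NE ranked just ahead
    sigma = 100   # uncertainty: actual rank is a noisy estimate of true strength
    beat_sw1 = beat_sw2 = beat_both = 0
    for _ in range(trials):
        ne = rng.gauss(ne_mean, sigma)
        sw1 = rng.gauss(sw1_mean, sigma)
        sw2 = rng.gauss(sw2_mean, sigma)
        beat_sw1 += ne > sw1
        beat_sw2 += ne > sw2
        beat_both += ne > sw1 and ne > sw2
    return beat_sw1 / trials, beat_sw2 / trials, beat_both / trials
```

Running this shows the NE team favored in each individual matchup, yet more likely than not that at least one SW team is truly better — which is exactly the case for shifting a bid from the NE to the SW.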
Are there other fixes that could improve the algorithm?
Yes, there are other solutions (though nothing is a magic bullet). You could require teams to play more games (say, 15) to be included in the rankings. But that would increase costs for teams and potentially be punitive to parts of the country where getting sanctioned games played is challenging due to weather or a scarcity of tournaments.
You could also include a human ranking element that is weighted with the computer algorithm to complete the final allocation. Downsides include the fact that few people watch enough games to be truly informed, that video coverage is still spotty, and that human polls introduce bias concerns.
Why is the Cincinnati Men’s team currently ranked #3?
There has been much discussion of Cincinnati over the past week, after the team elected not to participate in their final regular season tournament in order to protect their top 20 ranking. Because the rest of the teams at the tournament were ranked beneath Cincinnati, the team would mostly have faced downside by attending — close games or losses would have negatively affected their rating and could have jeopardized their ability to earn a bid for the Ohio Valley.
But how, then, did the team move from #14 to #3 without playing any games? There were two major effects.
First, Tulane — a team that Cincinnati beat twice earlier this season by substantial margins — had an outstanding weekend and won their final tournament, Huck Finn (the one where Cincinnati was going to play this past weekend). The positive effect on Tulane’s rating from winning the tournament buoyed Cincinnati via those early season wins.
Second, UC Santa Barbara — a team that Cincinnati lost to twice earlier this season — had those early season wins vacated because of academic eligibility issues. (Those issues are likely to be resolved before the finalized version of the rankings comes out later this week.) Because Cincinnati’s punishing losses no longer counted against them, they vaulted into the top five. Their limited connectivity with other teams means that those Tulane victories are the most important determinant of their rating.
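The Tulane effect is a direct consequence of game ratings being anchored to opponents’ ratings. A toy model (every number here is invented, and this is far simpler than the real algorithm) shows how a team can gain rating points without playing:

```python
def team_rating(game_results):
    """Toy model (NOT the real algorithm): a team's rating is the average of
    (opponent_rating + rating_differential_earned) over its counted games."""
    return sum(opp + diff for opp, diff in game_results) / len(game_results)

# Cincinnati's counted results as (opponent rating, differential earned).
# The first two entries stand in for the two big wins over Tulane.
before = [(1500, 300), (1500, 250), (1400, 100)]   # Tulane rated 1500
after  = [(1650, 300), (1650, 250), (1400, 100)]   # Tulane's rating jumps 150
```

Because two of the three counted games are anchored to Tulane, Tulane’s 150-point jump after winning Huck Finn lifts this hypothetical Cincinnati by 100 points — no new games required.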
Have there been other controversies surrounding the rankings?
Yes. In 2012, the Whitman Men’s team forfeited games during consolation play in order to avoid playing games that might have negatively affected their ranking. They ultimately earned a strength bid for the Northwest region. USA Ultimate subsequently changed the rules surrounding forfeiting, only allowing teams to forfeit their final game of consolation if both teams agree to do so.
During the 2013 club season, Portland Rhino corrected the score of one of their losses from 13-6 to 13-7 during the review period after preliminary rankings had been announced. That correction was enough to boost their rating by three points, which pushed them in front of San Diego Streetgang at #16 in the rankings. It earned Rhino a bid for the Northwest and stole one away from the Southwest. The score correction was legitimate, but it again highlighted the vagaries of the bright line bid cutoff.
There is also frequent concern about intraregional matchups being gamed in order to maximize strength bid opportunities for the region. For example, if a highly ranked team from the Southeast plays another Southeast team near the bid cutoff, there is an incentive for the highly ranked team to intentionally play poorly, allowing the weaker team to earn ranking points and potentially move into the top 20.
If there are other questions you’d like to see answered, leave them in the comments. We’ll do our best to explain!