Changing The Algorithm

by Charlie Eisenhood in Editors' Blog with 27 Comments

This week’s penultimate USA Ultimate college rankings are now posted. While for the most part the rankings look about right, there are some marginal discrepancies that range from odd to outright unfair.

Look at the cutoff of the Open Division at #18: Northern Iowa gets in to earn a fourth bid for the North Central. But they sit just nine points ahead of Texas A&M, a tiny advantage that could literally have been decided by one or two scores in a game.

Now, no disrespect to Northern Iowa, but who have they beaten? They got some strong wins at Huck Finn but they don’t have a single win — not even a single game! — against a top 25 team.

Texas A&M, on the other hand, has a win against USAU #8 Central Florida, and has played close against many of the country’s best teams. Their strength of schedule far exceeds UNI’s.

A&M has one bad loss — to Georgetown at Centex — but they beat them later in the weekend by a bigger margin.

Anyone looking at this situation would say that A&M deserves to be ranked higher than Northern Iowa. But not the ranking algorithm.

Now, you’re never going to have a perfect algorithm. Teams know at this point that wins matter, as do score differentials.

But it seems like we could reduce situations like this by changing the bid allocation process away from a purely mathematical computer approach to a blended approach like the Bowl Championship Series in college football.

There, a coaches poll, a sportswriter/player/expert poll, and a computer ranking get weighted equally to determine the BCS rankings. That means that human input accounts for two-thirds of the rankings, and data-driven algorithms determine the rest.
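As a rough illustration of equal weighting, here is a minimal sketch that averages each team's rank across three sources. It is a simplification and not the actual BCS formula: the function names, the team names, and the rank values are all made up for the example, and a real blended system could combine poll points or ratings instead of raw ranks.

```python
def blended_scores(coaches, experts, computer):
    """Average each team's rank across three equally weighted sources.

    Each argument maps team name -> rank (1 is best). Only teams ranked
    by all three sources are scored. Lower blended score is better.
    """
    teams = set(coaches) & set(experts) & set(computer)
    return {t: (coaches[t] + experts[t] + computer[t]) / 3 for t in teams}


def blended_order(coaches, experts, computer):
    """Return team names sorted from best to worst blended score."""
    scores = blended_scores(coaches, experts, computer)
    # Break ties alphabetically so the result is deterministic.
    return sorted(scores, key=lambda t: (scores[t], t))


# Hypothetical ballots: both human polls prefer Team B to Team C,
# while the computer ranking prefers Team C to Team B.
coaches = {"Team A": 1, "Team B": 2, "Team C": 3}
experts = {"Team A": 1, "Team B": 2, "Team C": 3}
computer = {"Team A": 1, "Team B": 3, "Team C": 2}

print(blended_order(coaches, experts, computer))
```

With these invented ballots the two human polls outvote the computer on the Team B versus Team C question, which is the two-thirds human weighting described above.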

USA Ultimate’s rankings need some human input. I believe there is enough coverage of the sport at this point that journalists, coaches, and experts could put in ballots that would help balance out some of the marginal “mistakes” that the rankings seem to make.

I threw this idea out on Twitter and got mixed feedback.

John Cassidy said, “Yeah but football is on tv and a lot easier to see all teams where in ultimate it is impossible.” Jeff Rathburn echoed that sentiment: “despite the gr8 work you do, there is nowhere near enough coverage to make informed votes that affect bids to Nats.”

But I think that, for the most part, a wide range of regional representation would smooth out any discrepancies. Just looking at this example above, does anyone put UNI above Texas A&M? Probably not, if they’re being honest, and that might make the difference for A&M getting that last bid.

I don’t think there should be an expectation that people need to have seen every team on tape to be able to make a solid top 25 ranking. Rathburn also said that you’d need way more information about rookies playing in games, key injuries, etc. to make good decisions about relative team strengths.

While that would be great, and would possibly make a difference in rankings, I don’t think most coaches and players who vote in the BCS rankings are sitting down and watching hours of tape about teams across the country in a different conference. They’re looking at results — game scores and margins — just like the computer.

There would obviously need to be some work done to get the right structure set up to make this work. But a human element in the USAU rankings is critical to the fairest distribution of bids.

About Charlie Eisenhood

Charlie Eisenhood is the editor-in-chief of Ultiworld. You can reach him by email (charlie@ultiworld.com) or on Twitter (@ceisenhood).


  • http://www.facebook.com/robmard Robbie Dennis

    In our defense (UNI), we tried to get into Centex and Easterns but were denied bids to both. If we could have played a stronger schedule we would have, but bid acceptance is all based on prestige, which we don’t really have. I think it’s kind of silly that prestige is really the basis for the bid system to these power tournaments, considering that in college sports all it takes is one dedicated class to really turn a program around; see Miami of Florida college basketball this season for a perfect example of this.

    • http://www.facebook.com/robmard Robbie Dennis

      Also Missouri is 24th and we have played them twice this season so the statement “but they don’t have a single win — not even a single game! — against a top 25 team” is not exactly true, but we still understand your point.

      • http://www.ultiworld.com/ Charlie Eisenhood

        That was a reference to our own top 25, not USAU’s, but point taken.

        I also want to say that we’re not trying to pick on you guys; it was an example to prove a larger point.

        I think program prestige is absolutely not a fair way to determine who goes to Nationals and who doesn’t. I think college should end up more structured along the lines of the Triple Crown Tour, where there are more tiers of play that have built-in crossovers.

        • PvdB

          I think a structure like the Triple Crown Tour for college ultimate would be a very, very bad idea (and it was already somewhat attempted with C1…we all saw how that went). Club teams (at least open and women’s) have much more continuity than college teams, both from one season to the next and across multiple seasons.

          I think invite-only tournaments allow for some amount of tiered play, while also still acknowledging the variability that occurs from year to year in college ultimate. The organizers of the big spring invite tournaments can see how teams perform in the fall and early spring to gauge who does and does not deserve an invite. If you’re a “new” team on the scene, then you need to take the fall seriously if you want to get into big tournaments. That’s the structure we presently have and I think it is working fine at the moment.

          I do, however, agree that the algorithm should be tweaked and I think it probably will, indeed, be tweaked between this season and next.

    • Mimmo

      It’s also based largely on Pre-season and early Spring results. Do whatever you can do to get into CCC next fall or another bigtime fall tournament and perform well to get your name out there!

      Also, bring back the NUMP!!!!

      • http://www.facebook.com/robmard Robbie Dennis

        We went to Missouri Loves Company last fall, but rather than fight for results we decided to play evenly through our roster and let younger guys get experience and build some roster depth, which we lacked the season before. Should we be punished for that? If spring tournaments are going to put so much stock in the fall season then we may as well just USAU sanction the fall as well.

        • Another guest

          Most of those big tournaments say in their “bids are now being accepted” posts that who is invited will be based largely on fall and early spring results (some even go back to how you finished up the previous season). So if you’re wanting to get into those tournaments and you didn’t play for results at your fall tournaments, you kinda can’t be too terribly upset about that one. On the other hand, in my opinion making your young guys better is a great way to go and probably has a lot to do with why you guys are doing so well this year, but the fact of the matter is you aren’t going to get into those big tournaments without some results at the time when they are taking bids.

      • anon

        The rankings for the spring only take into account CRS games. No fall games, no unsanctioned games. Do January results count? Yes but they have a smaller weight than March games.

        All teams start equal to each other on Jan 1.

    • guest

      Invite tournaments are the circle-jerks of this whole system. Teams who get the nod see almost immediate rating boost, and all they’ve really done is get invited. Invite tournaments should be thrown out if they’re going to keep the ranking algorithm the way it is.

      • Dikembe Mutombo

        If this is true then Dartmouth and Stanford would be getting bids over UNI. What you’re saying is the opposite of what happened. I agree that invite tournaments are very susceptible to being scheduled in ways that benefit the host team. Perhaps they should be seeded by a third party or something.

    • Another guest

      Also, a season like this can be that stepping stone to getting into those kinds of tournaments. Does it kinda suck for the seniors? Yeah, but it is a step that has to be taken. The same thing happened with Ohio a few years back; they went to Easterns Qualifier and won, competed well at Easterns and I believe were invited to Easterns the following two years (although they still played in the qualifier both years). As for teams running these tournaments, I can totally understand why they want to invite the teams that have proven year in and year out that they compete at the highest level. Let’s not forget that those teams running the tournaments are trying to get some high level experience too.

  • Guest

    You’re pointing to one data point which marginally supports your case. I could easily imagine years where the bubble teams are not clear cut whatsoever, and opinions are widely mixed. If I know anything about the Ultimate community, I know that they love to complain about rankings. The days leading up to a human vote would be a squeaky-wheel-off to see who can gather the grease.

    You claim that the human voters wouldn’t be looking at the footage, or considering other factors, and instead would just be looking at the schedule, record, and margins – “just like the computer.” This is a deterministic process and can be modeled… by an algorithm!

    I think what you’re looking for is an updated or new algorithm, but you don’t care to articulate the mathematics of why USAU’s currently-implemented algorithm fell short in this particular case.

    I’d be curious to see you leave your higher-level ideas like “strength of schedule”/ reference to top-25 ranked teams, and have you try your hand at a rankings algorithm. What new features of the data would you use to “smooth out any discrepancies?”

    • http://www.ultiworld.com/ Charlie Eisenhood

      No, no, I think there should be some subjectivity still there. I think everyone is going to have different opinions about which team is #1, which team should make the cut, etc. based on what they’ve seen, their biases, etc. Obviously people will disagree, which is why they complain now.

      I think that an averaged pool of ballots would end up getting a very good set of results, which should then be combined with a computer ranking to get a final set of USAU rankings.

      It’s not the fault of USAU’s algorithm. No matter how you build one, it will be difficult to get everything perfectly in sync with a more subjective ranking just because teams don’t have head-to-heads against very many others.

      We have thought about building our own algorithm that tries to improve on USAU’s. It’s a project we will likely get to within the year.

      • http://www.facebook.com/rthompson Ryan Thompson

        You have to combine making a better algorithm (not hard) with a good bid allocation system (slightly more difficult to agree on). This strict cutoff at #20 vs #21 isn’t the best way to do it when there’s so much uncertainty, and you’re assigning bids to regions, not teams.

        If a region has #11, #21, #22, #23, do they deserve a second bid more than a region with #1, #20, #40? Most people would say yes.

        • http://www.ultiworld.com/ Charlie Eisenhood

          Agreed on that. The bright line cutoff is what creates the problem, in some sense.

          You should come on our podcast and talk about it with us next week. Email me.

          • FullFieldHammer

            I think the fact that bids are assigned to regions and not teams helps alleviate the problems of having a strict cutoff. “Your region didn’t get you an extra bid, but since you are such a good team, you could go try to win it,” is how I see it.

            Not to get too far off base here, but I could also see this going to a question of what Nationals is trying to do. Are we trying to get the 20 best, most talented teams or the 20 teams having the best seasons, AKA best results? They aren’t always the same. If it is the former, you disregard a rainy SB Invite, you disregard teams beating teams without their best players (whether to U23 tryouts, wedding obligations, trips to Japan). In my mind, a human element is more in that direction.

          • guest

            The fact that bids are assigned to regions makes it LESS reasonable that there is a strict cutoff, not more. If team #18 is rated a couple points higher than team #19, why should the next-place team in team #18’s region benefit? There should be a “regional strength” calculation for the last few bids, not just comparisons of individual teams. This would have the side benefit of making regionals tournaments more competitive – a bias towards regions with a lot of teams that have a shot in the last game to go, rather than regions that have n bids but only n teams that are potentially nationals caliber.

          • FullFieldHammer

            That’s fair and I’ve heard this idea before – calculating regional strength based on the region as a whole rather than the top few teams in it. The counter argument is that only the top end is going to Nationals and you could hurt a top heavy region.

          • http://www.facebook.com/rthompson Ryan Thompson

            “… calculating regional strength based on the region as a whole rather than the top few teams in it. The counter argument is that only the top end is going to Nationals and you could hurt a top heavy region.”

            That’s not how an “expected teams in the top 20” system would work – teams with poor rankings have no effect on their region’s number of bids (probability distributions, etc).
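One way to read the "expected teams in the top 20" idea is sketched below, under stated assumptions. Each team's true strength is treated as normally distributed around its computed rating, and a region's strength is the sum of each team's probability of sitting above the bid cutoff. The normal model, the `sigma` value, the cutoff, and every rating here are invented for illustration; the thread does not specify any of them.

```python
import math


def prob_above(rating, cutoff, sigma):
    """P(true rating > cutoff), assuming true rating ~ Normal(rating, sigma)."""
    z = (rating - cutoff) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def expected_top_teams(region_ratings, cutoff, sigma):
    """Expected number of a region's teams that are truly above the cutoff."""
    return sum(prob_above(r, cutoff, sigma) for r in region_ratings)


CUTOFF = 1500.0  # hypothetical rating of the last bid-earning spot
SIGMA = 50.0     # hypothetical uncertainty in a team's rating

# Hypothetical regions: X has one safe team and three just under the line;
# Y has one elite team, one just over the line, and one far below it.
region_x = [1600.0, 1495.0, 1490.0, 1485.0]
region_y = [1900.0, 1505.0, 1200.0]

print(round(expected_top_teams(region_x, CUTOFF, SIGMA), 2))
print(round(expected_top_teams(region_y, CUTOFF, SIGMA), 2))
```

In this sketch the team rated 1200 adds almost nothing to its region's total, since its chance of truly belonging above the cutoff is negligible, while several teams bunched just under the line each contribute close to half a team.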

          • Steve

            That brings to mind another benefit of regional strength bids – rankings matter for more teams. Right now only teams that are close to the top 20 really care about their ranking. With regional strength calculations, many more teams will care. This also diminishes the incentive for teams to manipulate the ranking. Right now, if teams A and B are in the same region, A ranked around #20 and B ranked around #30, B has every incentive to not play hard against A, since B is unlikely to jump up and get a bid. However, with regional strength calculations, both rankings matter to at least a modest extent, so B won’t want its rating to go down just to make A’s go up.

  • FullFieldHammer

    This is one we disagree on.

    You realize the BCS system of voting is one of the most complained about and hated systems in sports, right? For years, people have been clamoring for modifications to a playoff format. Coaches rarely vote, giving the right to assistants, and who knows what information they are working off of. You get into situations where coaches are incentivized to vote for their regional opponents or out of region opponents they have beaten. You get media biased by who they cover and who they see, and coaches the same way. A coach might not even mean to, but because they see, gameplan for, and discuss the teams from their region, it informs their opinion.

    Footage will matter. People will be influenced by watching one good game on Nexgen or whatnot.

    There will always be teams that feel slighted by the algorithm. The answer is always to win.

    There are probably tweaks that should be made to the math. I can agree there. But the “make it all about strength of schedule” argument is the one that keeps the established teams in power and takes opportunities from others. Teams in strong regions will never have to travel, because they can just go lose games to the strong teams near them and expect a higher ranking than teams winning games elsewhere.

    The system isn’t perfect, but let’s not overreact.

    • http://www.ultiworld.com/ Charlie Eisenhood

      OK, it’s kind of a strawman to say that the BCS is bad therefore this idea is bad. The BCS is bad because they don’t have a playoff, not because the rankings are inherently bad.

      My point is that a human element is going to marginally improve the system. Nothing will be “perfect,” but I think it’s pretty clear that the USAU algorithm has some deficiencies.

      I’m also not denying that there will be subjective rankings. Yes, people will be influenced by things like a good game, or a team they know well in their region, etc. But that stuff will balance out.

      And, again, I’m not saying make it all about strength of schedule. But at least we should CONSIDER that like they do in other sports. We currently do not, at all.

      And, remember, it’s only the marginal decisions that matter: there will generally be human and computer agreement on most of the top 20. It’s those last few spots where I think a human element would help get to the most fair outcome.

      • FullFieldHammer

        I wasn’t saying “the BCS is bad, therefore this idea is bad.” I was saying “The BCS has flaws that have led to a strong dislike for it, and those flaws would exist were we to import that system.”

        Strength of schedule IS currently included, is it not? Wisconsin is 17-12, ranked a spot ahead of Virginia who is 21-6. Stanford is 7-12 and is ahead of Florida State at 18-8. So saying “we currently do not, at all” is not accurate.

        I think the system does a decent job of balancing SoS vs. W/L, but there are probably tweaks that could be made. I’d rather have people reexamine the math than put it in the easily influenced minds of people. I think there are far too many things that could go wrong and we wouldn’t really end up in a much better place – some teams will still feel robbed, some people will still disagree.

      • The Boneweasel

        “That stuff will balance out.”

        Sure, at the expense of certain teams. Subjectivity is a much worse problem than a computer ranking system, in my opinion. At least with a computer, you know the rules and what you have to do to succeed. With god-knows-who at the helm of the “human element,” you’re looking at a lot more controversy and confusion.

        Also, The BCS rankings are, in fact, inherently flawed.

      • anon

        False, who you play matters. And who they play matters. Every game results in a movement in the power ratings. There is a strength of schedule component in there. If a team plays a team rated 1500, their power rating will adjust differently than that of a team who played someone rated 200.
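To make the strength of schedule point concrete, here is a toy sketch in which every game produces a "game rating" equal to the opponent's rating plus a bonus or penalty for the margin, and a team's rating is the average of its game ratings. This is not USAU's formula, which the thread does not spell out; the function names and the `points_per_goal` value are assumptions chosen only to show why the opponent's rating matters.

```python
def game_rating(opponent_rating, margin, points_per_goal=50.0):
    """Rating earned from one game: opponent's rating shifted by the margin.

    margin is goals scored minus goals allowed (negative for a loss).
    points_per_goal is a made-up scale factor for this illustration.
    """
    return opponent_rating + points_per_goal * margin


def team_rating(games):
    """Average the game ratings over a list of (opponent_rating, margin) pairs."""
    return sum(game_rating(opp, margin) for opp, margin in games) / len(games)


# The same 3-goal win is worth very different amounts by opponent.
print(game_rating(1500.0, 3))   # win by 3 over a team rated 1500
print(game_rating(200.0, 3))    # win by 3 over a team rated 200

# A close loss to a strong team can still outscore a win over a weak one.
print(game_rating(1500.0, -2))
```

Under these assumptions a team that loses narrowly to strong opponents can outrate a team that beats weak ones, which is the trade-off between schedule strength and win-loss record debated in this thread.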

  • Dan C

    The completely computer generated algorithm has proven frustrating on the DIII level as well, especially with a general lack of non-region play in the Metro East and New England regions. Do they deserve 3 bids apiece when hardly any of those teams have played outside their regions? I disagree. Looking at last year’s Nationals results, the NE and ME region teams placed T-7th, T-9th, T-11th, and T-13th, whereas NC and NW teams (who both faced more diverse competition throughout the season) placed 1st, 2nd, T-5th, T-5th, T-7th, and T-9th. I guess more than anything I’m an NC player who is disgruntled that the region is currently slated to receive just 1 bid when they have 3 very deserving teams in GOP, St. John’s, and St. Olaf (teams who have all played many non-regional games) while two seemingly less deserving regions (in terms of non-regional play) in ME and NE are earning several bids apiece. Maybe similar results at Nationals this year will sway USA Ultimate to make changes to the bid allocation system. Here’s to hoping…

  • Mike d

    I think a human element in this process would add more uncertainty at this point. There simply has to be more data before any out of region coach/panel member can make a truly informed decision. I’ve seen dozens of teams play this season, but can hardly differentiate the top 10, let alone the balance of the top 25.

    Oftentimes the human element will discount the unknown based on history, and this seems to be the trend in rating ultimate teams too. Who is to say that UNI does not deserve the bid they are currently earning? I feel, based on the score reporter results and sources, that they do not deserve it, but I am not confident in that opinion. There was a similar argument made about the North Central getting 4 bids to Nationals in 2011, and 3 of those 4 teams made semis. If popular opinion was added, there is a possibility one of those teams would’ve been shut out. A computer model certainly isn’t perfect, but at this point in time, I’d take it over regional opinion.

    It definitely bothers me when I see a seemingly “unworthy” team in the top 20, but I think the model can be made slightly better in a couple ways:

    1. Raise the limit to earn bids to 15 games. It’s understood that the weather in the midwest and east coast this year has not been friendly to this notion, but if you have national aspirations, your team should make the sacrifice of driving or flying distances to get to warm tournaments.

    2. Put in a diversity clause in tournaments. Right now, there is no check in place to stop any region from hosting a local tournament/round robin at the end of the year to try to game bids. If you add a clause that says X% of any tournament/round robin has to come from out of region in the last 2 weekends of the year, it solves this gaming problem. Additionally, it will cause more regional smoothing for the model at the most crucial point of the year.
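The diversity clause in point 2 could be checked mechanically. The sketch below is one possible reading: it computes the share of a tournament's teams that come from outside the host region and compares it to a threshold. The function names and region labels are hypothetical, and the threshold is left as a parameter because the comment does not say what X% should be.

```python
def out_of_region_share(host_region, team_regions):
    """Fraction of attending teams whose region differs from the host's."""
    outside = sum(1 for region in team_regions if region != host_region)
    return outside / len(team_regions)


def meets_diversity_clause(host_region, team_regions, min_share):
    """True if at least min_share (0.0 to 1.0) of teams are out of region."""
    return out_of_region_share(host_region, team_regions) >= min_share


# Hypothetical late-season round robin: 8 teams, 2 from outside the host region.
regions = ["NE", "NE", "NE", "NE", "NE", "NE", "ME", "NC"]

print(out_of_region_share("NE", regions))
print(meets_diversity_clause("NE", regions, min_share=0.25))
print(meets_diversity_clause("NE", regions, min_share=0.5))
```

A rule like this would only apply to events in the final two weekends, so a real version would also need each tournament's date, which is omitted here to keep the sketch short.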
