
Incentives Matter

One of the things they drill into your head in college if you study economics is that incentives matter.

That is why I can say confidently today: USA Ultimate has to change their ranking system. It is fundamentally flawed.

This is not going to be an argument for why any particular team should get a strength bid for their Region (preliminary final USA Ultimate rankings are posted for Open and Women’s). Frankly, there will always be complaints about any ranking system, as the teams that finish on the bubble will probably have a case as to why they should have earned a bid.

Instead, I want to focus on the core problem with the system: it gives teams incentives to throw games and, potentially, falsify rosters.

We have already seen the issue crop up multiple times. We saw a situation at the New England Open last weekend where Harvard and Tufts had an incentive to let Dartmouth win in order to boost their shot at a third bid for the NE region. (For what it’s worth, both Harvard and Tufts played to win and played hard. They did not soft pedal their losses.)

During the last club season, Hot Metal sanctioned a last minute scrimmage against Regional rival Green Means Go that helped them earn the final strength bid, an even more questionable situation.

Call these conspiracies if you like, but there is no question: late in the season, regional teams have an incentive to help each other earn bids. That makes no sense, and it needs to change.

Let’s look again at this college season: Stanford now sits with the final strength bid after two of their worst early season losses were disqualified because the teams they played — UC San Diego and British Columbia — failed a roster comparison check. Will UCSD be in a hurry to fix what may have been a clerical error? Very unlikely, since they might end up costing their region a bid by making it right.

Regardless of how you feel about the Southwest earning two bids, it seems flat out crazy that we have a situation like this even arising.

Is the issue here the ranking algorithm? No, not really. While I think there are some problems with the algorithm itself, I think we should focus more on altering the way bids are allocated, either by shifting away from a bright-line bid cutoff or by including a human element in the rankings (possibly both).

But we have to end a situation where players and teams have an incentive to game the rankings. It is so obviously faulty that it can’t be allowed to continue this way for even one more season.

About Charlie Eisenhood

Charlie Eisenhood is the editor-in-chief of Ultiworld. You can reach him by email (charlie@ultiworld.com) or on Twitter (@ceisenhood).


  • llimllib

    Absolutely the issue here is the ranking system. It is completely possible to design one that never incentivizes a team to lose; such algorithms have been posted on RSD already. Fix the ranking system.

    • Top25 works.

      No, it isn’t. No one has mentioned this, or nobody wants to say it: THE RATING SYSTEM WORKED. Okay, we can complain about how “Connecticut and Michigan are weak! Great Lakes and Metro East shouldn’t get bids!!”, but the fact is that there are ZERO TEAMS in the top 18 who don’t deserve to go to nationals. Although Stanford, TAMU, and Dartmouth all have marquee wins (UNCW, UCF, and Colorado, respectively), Stanford has head-to-head wins over the other two. Yes, we should make the algorithm better. But let’s not pretend that we can change the algorithm and everything will solve itself. It won’t happen.

      • llimllib

        The algorithm can and should be changed such that it NEVER incentivizes a team to lose. When the algorithm incentivizes losing (which it does) unnecessarily (which it is), it did not work and should be changed.

        Yelling does not change this fact.

        • guest

          where exactly (using math not guessing or insinuation) would/does/did the system benefit a team by losing? Is there an example where a team intentionally lost to another team that resulted in a change of bid allocations? this year? last year? two years ago? Where are these games that are being thrown?

          • anon

            Any time a team far from the bid cutoff plays an in-region team close to the bid cutoff, the first team has some amount of incentive to lose. This could be a better team that has its bid locked up already, or a worse team that has no shot. You could mitigate the problem by using some measure of regional strength rather than a sharp cutoff, but there will still be edge cases.

            If he’s got an algorithm that removes any possible incentive to throw games, I’d love to hear it.

          • http://www.facebook.com/rthompson Ryan Thompson

            Harvard vs Tufts in the finals of New England Open, 2011. Got the region a second strength bid.

          • mottsauce

            You’re thinking of One Nightstand. Neither Harvard nor Tufts played in the first NEO.
            http://scores.usaultimate.org/scores2011/#college-open/tournament/8733
            http://scores.usaultimate.org/scores2011/#college-open/tournament/9059
            You can also read Bryan Jones’ analysis of that game and how it influenced the rankings here: http://skydmagazine.com/2012/03/gaming-the-rankings/

          • logic

            I can only come up with one way to eliminate (rather than just reduce) the possibility that doing more poorly against an in-region opponent could help your region get another bid: Don’t count intra-region games in determining regional bid allocations (sounds obvious). In every other system I have seen or imagined, the requirement that any fair system have some nonlinearity means that a team far from the steep portion of the distribution (e.g., the cutoff) can improve its region’s bid prospects by doing more poorly (not just losing) against a team closer to the cutoff (or high-sensitivity region). This is true of any rational “expected top 20” algorithm, including the “top 30” method and other approaches discussed on the RSD thread last year.

        • http://www.facebook.com/rthompson Ryan Thompson

          Sorry, rankings only incentivize teams to lose because the *bid allocation system* has a sharp cutoff at team #20 (give or take a few straggler regions). If you gave bids to Nationals directly to the top 20 teams, no team would ever have an incentive to lose, even with the ranking system staying the same. It’s bid allocation that should change, not the rankings.

          (although the rankings are flawed for numerous other reasons and should be changed because of those, this is a bid allocation problem)

          • logic

            I am certainly in favor of an “expected number of top 20 teams” approach to bids as suggested by Ryan (and actually developed a spreadsheet for one last night as a way to procrastinate). However, I also realized that any fair bid system based on this approach is nonlinear and thus can greatly reduce but not eliminate the possible incentive for teams to lose (or just do more poorly) against teams in their region. As an example, consider a region with teams ranked 2, 20 and 40. The #2 ranked team almost certainly should be in the top 20, while the 40th ranked team almost certainly should not (and a fair bid system would recognize this). Thus, there is an advantage to both of these teams to do more poorly when they play the #20 ranked team (or any team in their region nearer the cutoff than they are). Such a result might drop the #2 team to #3 or drop the #40 team to #45, but their likelihoods of being in the top 20 remain near 100% and 0%. Meanwhile, the #20 team might rise to #17, increasing its likelihood from 50% to 60%. This is of course much better than the current sharp cutoff in which the assumed likelihood rises from 0 to 100%.

          • http://www.facebook.com/rthompson Ryan Thompson

            That’s not exactly how it should work – rank number wouldn’t matter nearly as much as rating points, since that’s what you center your distributions around. Also, we’d need a good standard deviation for each team – in a narrow distribution, what you’re describing is more true than in a wide distribution.

          • logic

            My bid algorithm actually uses rating points, but I used rankings in the post thinking those numbers would make more immediate sense to everyone. The appropriate standard deviation to use can be estimated deterministically based on the success of the previous year’s final rankings on predicting regionals/nationals results.
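As a rough illustration of the “expected number of top 20 teams” idea discussed in this thread, here is a minimal Python sketch. It is not the spreadsheet mentioned above. It assumes each team's true strength is normally distributed around its rating, and it approximates "being in the top 20" as "being above a fixed cutoff rating". The function names, the cutoff of 1500, the standard deviation of 100, and all ratings are hypothetical, not USA Ultimate's numbers.

```python
import math
from typing import List


def prob_above_cutoff(rating: float, cutoff: float, sigma: float) -> float:
    """P(true strength > cutoff), assuming true strength ~ Normal(rating, sigma)."""
    z = (rating - cutoff) / (sigma * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))


def expected_top_teams(ratings: List[float], cutoff: float, sigma: float) -> float:
    """Expected number of a region's teams above the cutoff (sum of probabilities)."""
    return sum(prob_above_cutoff(r, cutoff, sigma) for r in ratings)


# Hypothetical region: a lock, a bubble team, and a team with no realistic shot.
CUTOFF, SIGMA = 1500.0, 100.0
before = expected_top_teams([2000.0, 1500.0, 1100.0], CUTOFF, SIGMA)

# Suppose the top team does worse against the bubble team and 30 rating
# points move from one to the other (an illustrative shift, not a real update rule).
after = expected_top_teams([1970.0, 1530.0, 1100.0], CUTOFF, SIGMA)

print(round(before, 3), round(after, 3))
```

In this made-up region, the top team's probability barely moves while the bubble team's rises noticeably, so the region's expected count goes up. That is the residual incentive described above: smaller than with a sharp cutoff, but not zero. A wider standard deviation flattens the curve and shrinks the effect.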

          • llimllib

            Ryan, it turns out that you were part of the team that wrote the post that I was thinking about when I initially commented! It took me forever to find the thread on rsd that I remembered from more than a year ago: http://www.rsdnospam.com/index.php?t=msg&goto=111789&&srch=algorithm#msg_111789

            And by “algorithm” I meant the whole process of rankings and bid allocation, which is of course itself an algorithm. Your proposed ranking and bid allocation system is exactly what I would like to see, and I defer to you on all matters related to the rankings.

  • Al C.

    Speaking of incentives, why should USAU be motivated to change the system?

    • Jacob Turner

      Because we pay them to when we sign up for USAU memberships.

      • Brian

        You pay them to change the algorithm?

      • Al C.

        The purpose of the rankings is to help ensure most of the best teams are at Nationals, not all the best teams. Hence why each region is guaranteed one bid. Will the 3rd team from the NE or the 5th team from the NC win Nationals? Never, so it’s really not a big deal.

        • anon

          Well put. I don’t think there is any sport that simply includes the “best teams” in its postseason. March madness doesn’t include the top 64 teams. NFL, NBA and MLB, NHL playoffs: Their systems aren’t designed to include the best teams. That’s the beauty of sports in general…giving an opportunity to the 3rd team in the NE, or the 5th team from the NC to make a run.

        • Nathan

          Saying it’s never going to happen is a stretch. We’ve already seen a team finish third in the NC and win Nationals.

  • Karl

    It’s interesting to me that the ultimate community places such an emphasis on spirit and sportsmanship, except when it comes to manipulating and gaming the system as described above. Granted the USAU ranking system isn’t perfect (any system that uses a mathematical ranking formula can’t be), most of the blame for these cases should fall upon the teams who throw games, falsify rosters, or take similar actions that defy the spirit of the rules.

    • http://www.ultiworld.com/ Charlie Eisenhood

      Just as in games where people cheat by abusing the self-officiation rules, incentives matter.

      • Karl

        So then you are also advocating for USAU to change the self-officiation aspect of the sport?

        • http://www.ultiworld.com/ Charlie Eisenhood

          As a side note, I’m all for observers whenever possible.

          But the issue here is: why would you choose a bid allocation system that can be easily gamed? There are so many better ways that would remove that component.

          I am not saying we need to blame USAU for anything; I’m saying we need to stop making it possible for people to cheat the system.

          Saying we have to rely on SOTG to solve everything doesn’t make any sense. We wouldn’t need observers if that was the case. People cheat; we should do everything we can to minimize it.

    • Annoyed

      I was thinking the same thing. Spirit of the Game should rule here so USAU doesn’t need to change anything, right? SOTG should make it obvious that you play to win even if it doesn’t serve your team best, right?

  • Guest

    I’ve seen your rankings and I’d prefer not to use the “human element” thank you very much.

    That said, maybe a 1 bid per region, 6-8 team strength bids, and then 2-4 regional strength bids would probably make some sense where regional strength bids are allocated based on the rankings of the 17/19 – 30ish best teams. Having multiple teams in that range acts as a multiplier for your regional strength.

    This would do three things – all positive:

    1. Lessen (although not eliminate) the misaligned incentive you mention (since the cut-off line isn’t quite as bright)

    2. Benefit regions with multiple borderline teams vying for a bid instead of regions with one borderline team.

    3. Make for more exciting regional tournaments where the last bid in a region is more likely to go to closely matched teams.

    • Guest

      Let’s describe this further. Let’s go with 1 bid/region. 8 team strength bids, and 2 Regional strength bids.

      Regional Strength bids are awarded to the team with the lowest Regional Rank.

      Individual Ranks = Rank – 17 (so the team ranked 18th would have a rank of 1).

      N = Number of teams from Region ranked 18-30.

      Regional Rank = ((Sum of Individual Ranks for all teams in Region ranked 18-30) – 12(N-1))/N

      Tie Break goes to the team with the most teams in rank (e.g. highest N). 2nd tie break goes to region with the highest ranked team.

      Once one regional strength bid is awarded the top ranked team from that region is removed from the formula and it repeats (this allows a very strong region to receive both bids).

      If this were applied for the open division this year (with the current USAU rankings) bids would be awarded to SC and then SW (both have -1, SW has 3 teams in 18-30). So it would move the AC bid to the SW.

      If this were applied to the women’s division this year, bids would still be awarded to the AC and SW (-4 and -2).

      While this could probably be tweaked to be made better, it isn’t bad and achieves the objectives listed in the above post.
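The Regional Rank formula above can be sketched in Python as follows. This is one reading of the comment, not a definitive implementation: it assumes "the top ranked team from that region" means the region's best team inside the 18-30 window, and the region names and rankings are invented for illustration rather than taken from the USAU rankings.

```python
from typing import Dict, List, Optional

LOW, HIGH = 18, 30  # rank window used in the proposal


def regional_rank(ranks: List[int]) -> Optional[float]:
    """Regional Rank per the formula above; lower is better.

    Returns None when the region has no team ranked in the window.
    """
    in_window = [r for r in ranks if LOW <= r <= HIGH]
    n = len(in_window)
    if n == 0:
        return None
    individual = [r - 17 for r in in_window]  # 18th -> 1, 19th -> 2, ...
    return (sum(individual) - 12 * (n - 1)) / n


def award_regional_bids(regions: Dict[str, List[int]], bids: int = 2) -> List[str]:
    """Award regional strength bids one at a time, recomputing after each award."""
    remaining = {
        name: sorted(r for r in ranks if LOW <= r <= HIGH)
        for name, ranks in regions.items()
    }
    awarded: List[str] = []
    for _ in range(bids):
        candidates = []
        for name, ranks in remaining.items():
            score = regional_rank(ranks)
            if score is not None:
                # Sort key: lowest Regional Rank, then most teams in the
                # window, then best-ranked team (name breaks any final tie).
                candidates.append((score, -len(ranks), ranks[0], name))
        if not candidates:
            break
        winner = min(candidates)[3]
        awarded.append(winner)
        # Assumption: drop the winner's best in-window team before the next round.
        remaining[winner] = remaining[winner][1:]
    return awarded


# Hypothetical regions, listing only each region's ranked teams.
example = {"A": [18, 25, 29, 30], "B": [19], "C": [20, 21]}
print(award_regional_bids(example))
```

In this invented example, region C (two teams just past the line) scores better than region B (one team at 19th), which is the depth effect the proposal is after.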

      • guest

        So regional strength is only based on the top 30 teams nationally? What if teams 1 and 2 are from the same region and then teams 40 and 41 are also? Does this mean that they’re less strong than a region that has the 18th team, 25th, 29th and 30th? Average rank is 21 vs 25. If using regional strength why not use the entire region?

        • Guest

          Because we are still trying to get the top teams at Nationals as much as possible. 30th is certainly an arbitrary cut-off. That could be flexible. As could the weighting of teams (e.g. 18th could carry relatively more or less weight to 19th as could having 2 vs. 3 teams in the range that we are comparing). But I think there’s probably consensus that we do not need to talk about the 40th best team when considering who should receive one of the 20 nationals bids.

          For your comparison, the first region (1, 2, 40, 41) would receive two bids (an auto bid and then a team strength bid for team 2).

          Your other region would receive an auto bid (which counts for team 18) and then 25, 29, and 30 would be considered against the other regions for a strength bid – in the allocation suggested above it is highly unlikely that they would earn it but possible (I believe they could get the second regional strength bid iff every region had a team in the top 18 AND every other region only placed one team in the 19-30 range).

          • guest

            there’s also probably consensus we don’t need to talk about the 30th ranked team so why are we considering them?

          • Guest

            Personally, I think a 30th vs 19th ranked game-to-go to nationals is pretty compelling and the difference between the two isn’t much. But like I said – it’s flexible. Where’s the cut-off? 25? If it’s 25, just change the formula slightly and you can get the same thing. Of course in that scenario it starts becoming more likely that just the 19th and 20th ranked teams get in – which is the scenario we are in now.

            30th provides for more of a regional strength bid over a team strength bid.

            Anyway, if you don’t like it, go ahead and propose something different.

          • guest

            but that’s the problem – there isn’t a 19th v 30th game to go or a 20 v 21 or anything else so anytime another layer is added (regional strength) a) you add subjectivity b) there’s more room for flaws. The same could be said of a 30th v 40th game to go. Is Texas A&M v Kansas any “better” than Kansas v Penn St? At least the second one is between different regions.

            Why add a subjective element such as at ‘x’ rank these teams aren’t “good enough to factor”?

          • guest

            What in the world are you talking about?

            Charlie raised an issue in his article: the current set-up creates an incentive for some teams to lose in certain situations.

            The solution I proposed would lessen that incentive. It would reduce the “bright line” of the cut-off by giving some more weight to a region’s depth up to a certain point.

            No one is talking about the 30th vs 40th. 21st vs. 30th is relevant because in the system I proposed a region with the 21st and 30th ranked teams could “earn a bid” over the region with the 20th best team and so the 21st vs 30th could be a real regional game-to-go to Nationals.

            Right now there is an “x” rank – it’s 20 (if every region has at least one team in the top 20). A lot of people (myself included) don’t think it’s satisfactory for reasons Charlie pointed out and other reasons.

            If you’re going to discuss this at least be cogent. If there are flaws, point them out so that we can discuss them.

          • guest

            YOU: “I think a 19v30 game to go to Nationals is compelling”
            ME: “there’s no such thing and it’s only compelling because you decided arbitrarily that 30 should be a cutoff for consideration. Why add subjective elements?”

            Again – why not include all teams to determine regional strength? If you draw a line somewhere whether “hard” at 20 or “soft” at 30 you have fixed your claimed problem

          • Guest

            Give an actual example of how you include all teams and then let’s look at problems with that example.

            If what you are suggesting instead is Ryan’s proposed “expected number of top 20 teams” which, in theory, considers all teams from a region, I agree that that’s a better solution than what I’m proposing. I’d still like to see an example.

  • FD88

    In the Olympics last summer, four women’s badminton teams were disqualified for intentionally throwing games in order to play against weaker opponents in the quarterfinals. There should rarely be an incentive to throw a game or forfeit one (see: Whitman last year) in order to game the ranking system. Let’s hope USAU has the sense to change the ranking system to deal with this problem.

    http://articles.latimes.com/2012/aug/01/sports/la-sp-oly-spirit-of-games-20120802

    • anon

      Or, if changing the ranking system isn’t an option, USAU has the power to simply disqualify teams from participating in the series if they are caught throwing games. That is another simple solution that should be considered.

      • FD88

        That is, to be perfectly frank, a terrible idea. Teams should not be given incentive to throw or forfeit games, which means that your ranking should not go DOWN simply by stepping on the field with a lower-ranked team. The whole idea of DQing a team because they “threw a game” is as ludicrous as it was in London this summer. What qualifies as “throwing a game”? Does resting your starters against a much greater opponent (see: Rhino 2011) or after having clinched the one-seed in a pool count? Is it trying to play as poorly as you can (Olympics)? The rankings system is broken––it would be adding insult to injury to DQ a team for trying to game it for another bid.

        • Vance

          Purposefully losing a game shouldn’t be grounds for disqualification, especially if you’re doing it to manipulate the system? Are you kidding me?

          • mottsauce

            No, it shouldn’t. You’ll put out your rookies/new guys for development purposes, and this overall helps your team and provides depth. Forfeiting games a la Whitman to protect your ranking should be prevented, but choosing to not play a game at full strength should not be.

            This is especially true during the series. You want a bid to regionals/nationals. Are you really trying to argue you should wear out your starters against a far better team that you have no shot of beating? So you’re saying that Minnesota-Duluth should be disqualified for resting their starters against Iowa and Wisconsin? Really?

          • Vance

            Are you responding to me? Where did I say anything about having an issue with resting starters? There is a monumental difference between resting your starters and purposefully losing a game.

          • mottsauce

            If you’re Vance, yes, I am responding to you.
            There is effectively no difference between resting your starters and purposefully losing. There is, however, a difference between these two and forfeiting games (I think we agree on that).
            The “morality” behind the decision, if that’s what you feel like calling it, has no effect on the result.

        • SV

          Your ranking doesn’t go down if you step on the field with a lower ranked opponent. This changed from last year.

  • Cody

    In regards to bid allocation, why not do something similar to the NCAA basketball tournament? Each region has one automatic bid that goes to the winner of its regional tournament. Then after that the top 8 remaining teams in the rankings are given bids to nationals. In addition, have sectional and regional games count towards the rankings algorithm so that even the top ranked teams still have an incentive to win at regionals.
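A minimal Python sketch of the NCAA-style allocation described in this comment: regional winners get automatic bids, then the highest-ranked teams without a bid fill the remaining spots. The function name, team labels, and the small at-large count in the example are hypothetical; the comment itself suggests 8 at-large bids.

```python
from typing import List


def allocate_bids(ranking: List[str], regional_winners: List[str],
                  at_large: int = 8) -> List[str]:
    """Automatic bids to regional winners, then at-large bids by ranking.

    `ranking` lists teams from best to worst after regionals are counted.
    """
    auto = list(regional_winners)
    taken = set(auto)
    # Walk down the rankings and take the best teams that lack an automatic bid.
    strength = [team for team in ranking if team not in taken][:at_large]
    return auto + strength


# Hypothetical three-region example with two at-large bids.
ranking = ["A1", "B1", "A2", "C1", "B2", "C2"]
winners = ["A2", "B1", "C2"]  # A2 and C2 upset higher-ranked teams at regionals
print(allocate_bids(ranking, winners, at_large=2))
```

Because at-large bids go to teams rather than to regions, a team cannot earn its region an extra bid by losing; an upset at regionals simply moves the higher-ranked loser into the at-large pool.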
